GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by 
such polynucleotides, the use of such polynucleotides and polypeptides, as well as the 
production and isolation of such polynucleotides and polypeptides. More particularly, the 
polynucleotides and polypeptides of the present invention have been putatively identified as
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

2. Description of Related Art 

The glycosidic bond of β-galactosides can be cleaved by different classes of
enzymes: (i) phospho-β-galactosidases (EC 3.2.1.85), which are specific for a phosphorylated
substrate generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent
uptake; (ii) typical β-galactosidases (EC 3.2.1.23), represented by the Escherichia coli LacZ
enzyme, which are relatively specific for β-galactosides; and (iii) β-glucosidases (EC
3.2.1.21) such as the enzymes of Agrobacterium faecalis, Clostridium thermocellum,
Pyrococcus furiosus or Sulfolobus solfataricus (Day, A.G. and Withers, S.G., (1986)
Purification and characterization of a β-glucosidase from Alcaligenes faecalis. Can. J.
Biochem. Cell. Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J. Biochem., 213,
305-312; Ait, N., Cruezet, N. and Cattaneo, J. (1982) Properties of β-glucosidase purified
from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991)
Evidence that β-galactosidase of Sulfolobus solfataricus is only one of several activities of
a thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members of
the latter group, although highly specific with respect to the β-anomeric configuration of
the glycosidic linkage, often display a rather relaxed substrate specificity and hydrolyze
β-glucosides as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose
groups on a polysaccharide backbone or the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose
groups internally on a polysaccharide backbone or the cleavage of di- or
oligosaccharides comprising mannose groups. β-mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and cleave di- or
oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4 linked
mannose backbone with α-1,6 linked galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-mannanase
hydrolyzes the mannose backbone internally, β-mannosidase hydrolyzes non-reducing,
terminal mannose residues, and α-galactosidase hydrolyzes α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety 
of applications. Guar is commonly used as a thickening agent in food and is utilized in 
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are 
industrially relevant for the degradation and modification of guar. Furthermore, a need 
exists for thermostable galactomannanases that are active in extreme conditions associated
with drilling and well stimulation. 

There are other applications for these enzymes in various industries, such as the
beet sugar industry. 20-30% of the domestic U.S. sucrose consumption is sucrose from
sugar beets. Raw beet sugar can contain a small amount of raffinose when the sugar beets
are stored before processing and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there is
merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used as
a digestive aid to break down raffinose, stachyose, and verbascose in such foods as beans
and other gassy foods.

β-galactosidases which are active and stable at high temperatures appear to be
superior enzymes for the production of lactose-free dietary milk products (Chaplin, M.F.
and Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 156-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far. Two
reported genes encode β-galactoside-cleaving enzymes of the hyperthermophilic bacterium
Thermotoga maritima, one of the most thermophilic organotrophic eubacteria described to
date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R., Sleytr, U.B. and
Stetter, K.O. (1986) Thermotoga maritima sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C, Arch. Microbiol. 144, 324-333). The gene
products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The
enzyme hydrolyzes α-1,6-glucosidic linkages on these polymers. Starch degradation for
the production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch proceeds in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a debranching
enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the 
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic bonds
in cellulose, which may be used for the conversion of plant biomass into fuels and
chemicals. Endoglucanases also have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry for
the clarification and extraction of juices.

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

Figures 1a-b are the full-length DNA and corresponding deduced amino acid
sequence of M11TL of the present invention. Sequencing was performed using a 378
automated DNA sequencer for all sequences of the present invention (Applied Biosystems,
Inc.).

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of OC1/4V-33B/G. 

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of F1-12G. 

Figures 4a-b are the full-length DNA and corresponding deduced amino acid
sequence of 9N2-31B/G.

Figures 5a-b are the full-length DNA and corresponding deduced amino acid
sequence of MSB8-6G.

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence
of AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid
sequence of GC74-22G.

Figures 8a-b are the full-length DNA and corresponding deduced amino acid 
sequence of VC1-7G1. 

Figures 9a-c are the full-length DNA and corresponding deduced amino acid
sequence of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid
sequence of 6GC2.

Figures 11a-d are the full-length DNA and corresponding deduced amino acid
sequence of 6GP2.

Figures 12a-c are the full-length DNA and corresponding deduced amino acid
sequence of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid
sequence of 6GP3.

Figures 15a-d are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GB4.

Figures 17a-d are the full-length DNA and corresponding deduced amino acid
sequence of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid
sequence of Pyrococcus furiosus VC1-7EG1.

SUMMARY OF THE INVENTION 

In a preferred embodiment of the present invention, there are provided isolated 
nucleic acids (polynucleotides) which encode mature enzymes having the deduced amino 
acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64).

In another embodiment, the invention provides a method for producing a
polypeptide including culturing host cells containing the polynucleotide of Figures 1-18,
expressing from the host cell a polypeptide encoded by the polynucleotide, and isolating the
polypeptide.

In another embodiment, the invention provides an enzyme selected from the group
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an enzyme
having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.

In yet another embodiment, the invention provides a method for generating glucose
from soluble cell oligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28 and 61-64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to
the filing date of the present application. Nothing herein is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior
invention.

Definitions

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or
ketone unit.

"Oligosaccharide", as used herein, refers to short chains of monosaccharide units
joined together by covalent bonds. Of these, the most abundant are the disaccharides,
which have two monosaccharide units.

"Polysaccharide", as used herein, refers to long chains having many
monosaccharide units.

The term "gene" means the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region (leader and trailer) as 
well as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA 
polymerase will transcribe the two coding sequences into a single mRNA, which is then 
translated into a single polypeptide having amino acids derived from both coding 
sequences. The coding sequences need not be contiguous to one another so long as the 
expressed sequences ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by recombinant DNA 
techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding
the desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis. 

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular
enzyme is a DNA sequence which is transcribed and translated into an enzyme when
placed under the control of appropriate regulatory sequences.

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel 
enzymes, as well as active fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present invention, there are provided 
isolated nucleic acid molecules encoding the enzymes of the present invention including 
mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for producing such polypeptides by recombinant techniques comprising culturing 
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence 
of the present invention, under conditions promoting expression of said enzymes and 
subsequent recovery of said enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes for 
hydrolyzing lactose to galactose and glucose for use in the food processing industry, the 
pharmaceutical industry, for example, to treat intolerance to lactose, as a diagnostic reporter 
molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile industry
and in the detergent industry. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan 
polysaccharide) to remove non-reducing terminal mannose residues. Further 
polysaccharides such as galactomannan and the enzymes according to the invention that
degrade them have a variety of applications. Guar gum is commonly used as a thickening
agent in food and also is utilized in hydraulic fracturing in oil and gas recovery.
Consequently, mannanases are industrially relevant for the degradation and modification
of guar gums. Furthermore, a need exists for thermostable mannanases that are active in
extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the present invention.

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in 
vitro purposes related to scientific research, for example, to generate probes for identifying 
similar sequences which might encode similar enzymes from other organisms by using 
certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled 
in the art from the teachings herein. 

The polynucleotides of this invention were originally recovered from genomic gene 
libraries derived from the following organisms: 

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in
Yellowstone National Park. The organism grows optimally at 85-88°C, pH 7.0 in a low salt
medium containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas
phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from
Yellowstone National Park. It grows optimally at 75°C in a low salt medium with cellulose
as a substrate and N2 in gas phase.

Pyrococcus furiosus VC1 (including clone 7EG1) is from the genus Pyrococcus. VC1 was
isolated from Vulcano, Italy. It grows optimally at 100°C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated
from Vulcano, Italy. It grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from
diffuse vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at
87°C.

Thermotoga maritima MSB8 (including clones 6GP2 and 6GB4) is from the
genus Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85°C,
pH 6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus.
AEDII12RA grows optimally at 85°C, pH 9.5 in a high salt medium (marine) containing
polysulfides and yeast extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85°C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine medium under anaerobic
conditions. It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by
the organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ ID
NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4
and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and SEQ
ID NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and
SEQ ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24), "6GP2" (Figure 11
and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ ID NOS:12 and 26),
"OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3" (Figure 14 and SEQ ID
NOS:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and 61), "MSB8-6GB4" (Figure
16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and SEQ ID NOS:59 and 63),
and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).

The polynucleotides and polypeptides of the present invention show identity at the
nucleotide and protein level to known genes and proteins encoded thereby as shown in
Table 1.

Table 1

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
M11TL-29G | Sulfolobus solfataricus DSM 1616/P1, β- | 51% | 55%
OC1/4V-33B/G | Caldocellum saccharolyticum β-glucosidase | 52% | 57%
Staphylothermus marinus F1-12G | Bacillus polymyxa, β-galactosidase | 36% | 48%
Thermococcus 9N2-31B/G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 51% | 50%
Thermotoga maritima MSB8-6G | Clostridium thermocellum bglB | 45% | 53%
Thermococcus AEDII12RA-18B/G | Bacillus polymyxa, β-galactosidase | 34% | 48%
Thermococcus chitonophagus GC74-22G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 46% | 54%
Pyrococcus furiosus VC1-7G1 | Sulfolobus solfataricus/MT-4 β-galactosidase | 46.4% | 52.5%
Thermotoga maritima (6GC2) | Pediococcus pentosaceus α-galactosidase | 49% | 29%
Thermotoga maritima β-mannanase (6GP2) | Aspergillus aculeatus mannanase | 56% | 37%
AEPII 1a β-mannosidase (63GB1) | Sulfolobus solfataricus β-galactosidase | 78% | 56%
OC1/4V endoglucanase (33GP1) | Clostridium thermocellum endo-1,4-β-endoglucanase | 65% | 43%
Thermotoga maritima pullulanase (6GP3) | Caldocellum saccharolyticum α-dextran 6-glucanohydrolase | 72% | 53%
Bankia gouldi mix endoglucanase (37GP1) | None available | |

The polynucleotides and enzymes of the present invention show homology to each
other as shown in Table 2.

Table 2

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity
Staphylothermus marinus F1-12G | Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase | 55% | 57%
Thermococcus 9N2-31B/G | Thermococcus chitonophagus GC74-22G, glucosidase | 74% | 66%
Pyrococcus furiosus VC1-7G1 | Pyrococcus furiosus VC1-7B/G β-galactosidase | 46.4% | 54%


All the clones identified in Tables 1 and 2 encode polypeptides which have
α-glycosidase or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the
enzymes of the present invention, also provides substantially similar sequences. Isolated
nucleic acid sequences are substantially similar if: (i) they are capable of hybridizing under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and 57-60;
or (ii) they are DNA sequences which are degenerate to the polynucleotides of SEQ ID
NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences of
SEQ ID NOS: 15-28 and 61-64, but have variations in the nucleotide coding sequences. As
used herein, substantially similar refers to the sequences having similar identity to the
sequences of the instant invention. The nucleotide sequences that are substantially the same
can be identified by hybridization or by sequence comparison. Enzyme sequences that are
substantially the same can be identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.
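
As an illustration of degeneracy only, the following minimal Python sketch checks whether two different coding sequences encode the same amino acids. The two short sequences are hypothetical toy inputs, not sequences of the invention, and the codon table is a small excerpt of the standard genetic code covering only the codons used here.

```python
# Excerpt of the standard genetic code, limited to the codons used below.
# The sequences are hypothetical toy examples, not SEQ ID NOS of the text.
CODON_TABLE = {
    "ATG": "M", "GGT": "G", "GGC": "G", "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L", "TAA": "*", "TGA": "*",
}


def translate(dna):
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def are_degenerate(dna_a, dna_b):
    """True if the two sequences differ in bases but encode the same residues."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


seq_a = "ATGGGTAAACTGTAA"
seq_b = "ATGGGCAAGTTATGA"
print(translate(seq_a), translate(seq_b), are_degenerate(seq_a, seq_b))
```

Both toy sequences translate to the same four residues although they differ at several nucleotide positions, which is the sense in which one sequence is degenerate to the other.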

One means for isolating the nucleic acid molecules encoding the enzymes of the
present invention is to probe a gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel F.M. et al. (Eds.) Greene Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides
of SEQ ID NOS: 1-14 and 57-60 or fragments thereof (comprising at least 12 contiguous
nucleotides) are particularly useful probes. Other particularly useful probes for this purpose
are fragments hybridizable to the sequences of SEQ ID NOS: 1-14 and 57-60 (i.e.,
comprising at least 12 contiguous nucleotides).
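
By way of illustration, candidate probe fragments of at least 12 contiguous nucleotides can be enumerated with a simple sliding window. This is a minimal Python sketch; the function name and the 16-nucleotide input are hypothetical and are not taken from the sequence listing.

```python
def contiguous_fragments(sequence, length=12):
    """Yield (start, fragment) for every contiguous window of the given length."""
    if length < 12:
        raise ValueError("the text specifies at least 12 contiguous nucleotides")
    for start in range(len(sequence) - length + 1):
        yield start, sequence[start:start + length]


toy_sequence = "ATGGCTAAGGTTCCGA"  # hypothetical 16-nt sequence
probes = list(contiguous_fragments(toy_sequence))
for start, fragment in probes:
    print(start, fragment)
```

A sequence of n nucleotides yields n - 11 such 12-mers; selecting among them (for example by conserved region) is outside the scope of this sketch.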

With respect to nucleic acid sequences which hybridize to specific nucleic acid
sequences disclosed herein, hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions. As an example of
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M
NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5
mg/ml polyriboadenylic acid. Approximately 2 × 10⁷ cpm (specific activity 4-9 × 10⁸
cpm/µg) of ³²P end-labeled oligonucleotide probe are then added to the solution. After 12-16
hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X
SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5%
SDS, followed by a 30 minute wash in fresh 1X SET at Tm - 10°C for the oligonucleotide
probe. The membrane is then exposed to autoradiographic film for detection of
hybridization signals.
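
The final wash temperature is stated relative to the melting temperature (Tm) of the oligonucleotide probe, read here as 10°C below Tm. The text does not say how Tm is obtained. The Python sketch below assumes the Wallace rule (2°C per A or T, 4°C per G or C), a common rough estimate for short oligonucleotides; that rule and the 18-nucleotide probe are assumptions made for illustration only.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in °C for a short oligo (Wallace rule, an assumption here)."""
    oligo = oligo.upper()
    at_count = oligo.count("A") + oligo.count("T")
    gc_count = oligo.count("G") + oligo.count("C")
    if at_count + gc_count != len(oligo):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at_count + 4 * gc_count


def final_wash_temperature(oligo, offset=10):
    """Wash temperature taken as the estimated Tm minus the stated offset."""
    return wallace_tm(oligo) - offset


probe = "ATGGCTAAGGTTCCGAAT"  # hypothetical 18-nt probe
print(wallace_tm(probe), final_wash_temperature(probe))
```

In practice Tm depends on salt concentration and probe length, so an empirically determined value may be preferable to this estimate.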

Stringent conditions means hybridization will occur only if there is at least 90%
identity, preferably at least 95% identity and most preferably at least 97% identity between
the sequences. Further, it is understood that a fragment of a 100 bps sequence that is 95 bps
in length has 95% identity with the 100 bps sequence from which it is obtained. See J.
Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor
Laboratory (1989), which is hereby incorporated by reference in its entirety.

As used herein, a first DNA (RNA) sequence is at least 70% and preferably at least
80% identical to another DNA (RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases of the first sequence and the
bases of the other sequence, when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which 
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14 
and 57-60). For example, a polynucleotide which is at least 90% identical to a reference 
polynucleotide, has polynucleotide bases which are identical in 90% of the bases which 
make up the reference polynucleotide and may have different bases in 10% of the bases 
which comprise that polynucleotide sequence. 
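
The identity definition above can be sketched as a short calculation over two sequences that have already been aligned to equal length, with "-" marking a gap. This Python sketch is illustrative only; it does not perform the alignment itself (the text mentions BLASTN for that), and the inputs are hypothetical.

```python
def percent_identity(reference, candidate):
    """Percent of reference bases matched by the aligned candidate sequence."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must be pre-aligned to equal length")
    reference, candidate = reference.upper(), candidate.upper()
    reference_bases = sum(1 for base in reference if base != "-")
    if reference_bases == 0:
        raise ValueError("reference contains no bases")
    matches = sum(
        1 for ref_base, cand_base in zip(reference, candidate)
        if ref_base != "-" and ref_base == cand_base
    )
    return 100.0 * matches / reference_bases


# One mismatch in ten reference bases gives 90% identity.
print(percent_identity("ATGGCTAAGG", "ATGGCTAAGC"))

# A 95-base fragment of a 100-base reference gives 95% identity.
reference_100 = "ACGT" * 25
fragment_95 = reference_100[:95] + "-" * 5
print(percent_identity(reference_100, fragment_95))
```

The denominator is the number of bases in the reference, which matches the statement that 90% identity means identical bases at 90% of the positions making up the reference polynucleotide.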

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example the changes do not alter
the amino acid sequence encoded by the polynucleotide. The present invention also relates
to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions
and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred 
aspect of the invention these polypeptides retain the same biological action as the 
polypeptide encoded by the reference polynucleotide. 

It is also appreciated that such probes can be and are preferably labeled with an
analytically detectable reagent to facilitate identification of the probe. Useful reagents 
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of 
catalyzing the formation of a detectable product. The probes are thus useful to isolate 
complementary copies of DNA from other sources or to screen such sources for related 
sequences. 

The polynucleotides of this invention were recovered from genomic gene libraries 
from the organisms listed in Table 1. For example, gene libraries can be generated in the 
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be 
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries 
are thus generated and excisions performed according to the protocols/methods hereinafter 
described.

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other glycosidases are identified and
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention,
yield the activities as described above.

The coding sequences for the enzymes of the present invention were identified by 
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase 
activity. 

An example of such an assay is a high temperature filter assay wherein expression
clones were identified by use of high temperature filter assays using buffer Z (see recipe
below) containing 1 mg/ml of the substrate
5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU) (Diagnostic Chemicals Limited or
Sigma) after introducing an excision library into the E. coli strain BW14893 F'kan1A.
Expression clones encoding XGLUases were identified and repurified from M11TL, OC1/4V,
Pyrococcus furiosus VC1, Staphylothermus marinus F1, Thermococcus 9N-2, Thermotoga
maritima MSB8, Thermococcus alcaliphilus AEDII12RA, and Thermococcus chitonophagus
GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics,
p. 445.)

per liter:

Na2HPO4·7H2O         16.1 g
NaH2PO4·7H2O         5.5 g
KCl                  0.75 g
MgSO4·7H2O           0.246 g
β-mercaptoethanol    2.7 ml
Adjust pH to 7.0
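
Because the recipe is given per liter, the amounts for another batch volume follow by simple proportion. The Python sketch below is illustrative; the component labels are shorthand chosen for the example and the 250 ml batch is a hypothetical volume.

```python
# Per-liter amounts as listed in the Z-buffer recipe above.
# Labels are shorthand for this sketch; units are in parentheses.
Z_BUFFER_PER_LITER = {
    "sodium phosphate, dibasic (g)": 16.1,
    "sodium phosphate, monobasic (g)": 5.5,
    "KCl (g)": 0.75,
    "magnesium sulfate (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(volume_ml, recipe=Z_BUFFER_PER_LITER):
    """Scale each per-liter amount linearly to the requested volume in ml."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {name: round(amount * volume_ml / 1000.0, 4)
            for name, amount in recipe.items()}


batch = scale_recipe(250)  # hypothetical 250 ml batch
for name, amount in batch.items():
    print(f"{name}: {amount}")
```

The pH adjustment to 7.0 is performed on the final solution and is not affected by the scaling.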

High Temperature Filter Assay

(1) The F factor F'kan (from E. coli strain CSH118)(1) was introduced into the
pho- pnh- lac- strain BW14893(2). The filamentous phage library was plated
on the resulting strain, BW14893 F'kan. (Miller, J.H. (1992) A Short Course in
Bacterial Genetics; Lee, K.S., Metcalf, et al. (1992) Evidence for two phosphonate
degradative pathways in Enterobacter aerogenes, J. Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80 µg/ml
methicillin and 1 mM IPTG, colony lifts were performed using Millipore HATF
membrane filters.

(3) The colonies transferred to the filters were lysed with chloroform vapor in 150 mm 
glass petri dishes. 

(4) The filters were transferred to 100 mm glass petri dishes containing a piece of 
Whatman 3 MM filter paper saturated with buffer. 

(a) when testing for galactosidase activity (XGALase), 3 MM paper was
saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
Corporation). After transferring the filter bearing lysed colonies to the glass
petri dish, the dish was placed in an oven at 80-85°C.

(b) when testing for glucosidase activity (XGLUase), 3 MM paper was saturated
with Z buffer containing 1 mg/ml XGLU. After transferring the filter bearing
lysed colonies to the glass petri dish, the dish was placed in an oven at 80-85°C.

(5) 'Positives' were observed as blue spots on the filter membranes. The following
filter rescue technique was used to retrieve plasmid from a lysed positive colony. A
Pasteur pipette (or glass capillary tube) was used to core blue spots on the filter
membrane. The small filter disk was placed in an Eppendorf tube containing 20 µl water.
The Eppendorf tube was incubated at 75°C for 5 minutes, followed by vortexing to elute
plasmid DNA off the filter. This DNA was transformed into electrocompetent E. coli
DH10B cells for Thermotoga maritima MSB8-6G, Staphylothermus marinus F1-12G,
Thermococcus AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G,
M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were used for
Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1. The filter-lift assay was
repeated on transformation plates to identify 'positives'. The transformation plates were
returned to a 37°C incubator after the filter lift to regenerate colonies. 3 ml of LB liquid
containing 100 µg/ml ampicillin was inoculated with repurified positives and incubated
at 37°C overnight. Plasmid DNA was isolated from these cultures and the plasmid insert
was sequenced. In some instances where the plates used for the initial colony lifts
contained non-confluent colonies, a specific colony corresponding to a blue spot on the
filter could be identified on a regenerated plate and repurified directly, instead of using
the filter rescue technique.

Another example of such an assay is a variation of the high temperature filter assay 
wherein colony-laden filters are heat-killed at different temperatures (for example, 105 °C 
for 20 minutes) to monitor thermostability. The 3 MM paper is saturated with different 
buffers (i.e., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme
activity under different buffer conditions. 

A β-glucosidase assay may also be employed, wherein GlcβpNp is used as an 
artificial substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result 
of p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer 
equipped with a thermostatted cuvette holder. The assays may be performed at 80 °C or 
90 °C in a closed 1-ml quartz cuvette. A standard reaction mixture contains 150 mM 
trisodium substrate, pH 5.0 (at 80 °C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹ 
cm⁻¹). The reaction mixture is allowed to reach the desired temperature, after which the 
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume). 

One unit (U) of β-glucosidase activity is defined as the amount of enzyme required to 
catalyze the formation of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate. 
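The unit definition above can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed assay: it assumes a 1 cm optical path length (the text does not state one) and uses the extinction coefficient (0.561 mM⁻¹ cm⁻¹) and final volume (1.06 ml) quoted above. The function name and parameters are hypothetical.

```python
# Illustrative sketch: convert an absorbance slope at 405 nm into enzyme units.
EPSILON_PNP = 0.561        # mM^-1 cm^-1, extinction coefficient of pNp quoted in the text
REACTION_VOLUME_ML = 1.06  # final reaction volume quoted in the text


def glucosidase_units(delta_a405_per_min, path_length_cm=1.0,
                      volume_ml=REACTION_VOLUME_ML, epsilon=EPSILON_PNP):
    """Return activity in U (umol pNp released per minute).

    path_length_cm defaults to 1.0, which is an assumption, not a value
    taken from the text.
    """
    if epsilon <= 0 or path_length_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    # Beer-Lambert: A = epsilon * c * l, so dc/dt (mM/min) = (dA/dt) / (epsilon * l)
    rate_mm_per_min = delta_a405_per_min / (epsilon * path_length_cm)
    # mM multiplied by ml gives umol, so this is umol/min, i.e. units
    return rate_mm_per_min * volume_ml
```

Under these assumptions, an absorbance increase of 0.561 per minute corresponds to 1 mM pNp per minute in the cuvette, or about 1.06 U in the 1.06 ml reaction.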

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A 
Short Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular 
Genetics, the contents of which are hereby incorporated by reference in their entirety. 

A quantitative fluorometric assay for β-galactosidase specific activity is described 
by Youngman, P. (1987) Plasmid Vectors for Recovering and Exploiting Tn917 
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical 
Approach (ed. K. Hardy) pp. 79-103. IRL Press, Oxford. A description of the procedure can 
be found in Miller (1992) p. 75-77, the contents of which are incorporated by reference 
herein in their entirety. 

The polynucleotides of the present invention may be in the form of DNA which 
DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double- 
stranded or single-stranded, and if single stranded may be the coding strand or non-coding 
(anti-sense) strand. The coding sequences which encode the mature enzymes may be 
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60) or 
may be a different coding sequence which coding sequence, as a result of the redundancy 
or degeneracy of the genetic code, encodes the same mature enzymes as the DNA of 
Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 

The polynucleotide which encodes the mature enzyme of Figures 1-18 (SEQ ID 
NOS: 15-28 and 61-64) may include, but is not limited to: only the coding sequence for the 
mature enzyme; the coding sequence for the mature enzyme and additional coding sequence 
such as a leader sequence or a proprotein sequence; the coding sequence for the mature 
enzyme (and optionally additional coding sequence) and non-coding sequence, such as 
introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature enzyme. 

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes having 
the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). The 
variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature 
enzymes as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of 
such polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which 
is a naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ 
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of a 
polynucleotide sequence which may have a substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full length gene of the present invention may be used as a 
hybridization probe for a cDNA or a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for example, at least 50 or more bases. 
The probe may also be used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the complete gene including 
regulatory and promoter regions, exons, and introns. An example of a screen comprises 
isolating the coding region of the gene by using the known DNA sequence to synthesize an 
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that 
of the gene of the present invention are used to screen a library of genomic DNA to 
determine which members of the library the probe hybridizes to. 
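The idea of an oligonucleotide probe complementary to a known gene sequence can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the input is a plain A/C/G/T string, and the only constraint enforced is the minimum probe length of 10 bases mentioned above. It says nothing about labeling or hybridization conditions.

```python
# Illustrative sketch: derive a probe complementary to a stretch of a known sequence.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_probe(gene_seq, start, length=30):
    """Return a probe complementary to gene_seq[start:start + length].

    The default length of 30 and the lower bound of 10 follow the preferred
    probe sizes named in the text; both names are hypothetical helpers.
    """
    if length < 10:
        raise ValueError("probe should have at least 10 bases")
    target = gene_seq[start:start + length]
    if len(target) < length:
        raise ValueError("requested window runs past the end of the sequence")
    return reverse_complement(target)
```

For example, with a toy 21-base input used purely for illustration, `design_probe("ATGCTACCAGAAGGCTTTCTC", 0, 10)` yields the 10-base probe complementary to the first ten bases.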

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more 
preferably at least 95% identity between the sequences. The present invention particularly 
relates to polynucleotides which hybridize under stringent conditions to the hereinabove- 
described polynucleotides. As herein used, the term "stringent conditions" means 
hybridization will occur only if there is at least 95% and preferably at least 97% identity 
between the sequences. The polynucleotides which hybridize to the hereinabove described 
polynucleotides in a preferred embodiment encode enzymes which retain 
substantially the same biological function or activity as the mature enzyme encoded by the 
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
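The identity thresholds named above (70%, 90%, 95% and 97%) can be made concrete with a small calculation. The following is a sketch only: it assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts a gap opposite a base as a mismatch, which is one of several conventions in use. It does not model hybridization itself, and the function names are hypothetical.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over the aligned columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column with gaps in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct):
    """Map a percent identity onto the thresholds named in the text."""
    for threshold in (97, 95, 90, 70):
        if pct >= threshold:
            return "at least %d%%" % threshold
    return "below 70%"
```

Two aligned 10-base sequences that differ at a single position give 90% identity, which falls in the "at least 90%" tier but below the 95% figure used in the definition of stringent conditions.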

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 
bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide 
of the present invention and which has an identity thereto, as hereinabove described, and 
which may or may not retain activity. For example, such polynucleotides may be employed 
as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for example, for 
recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% 
identity, preferably at least 90% identity and more preferably at least a 95% identity to a 
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as 
fragments thereof, which fragments have at least 15 bases, preferably at least 30 bases and 
most preferably at least 50 bases, which fragments are at least 90% identical, preferably at 
least 95% identical and most preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which have the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzyme. 

The terms "fragment," "derivative" and "analog" when referring to the enzymes of 
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) mean enzymes which retain essentially the 
same biological function or activity as such enzymes. Thus, an analog includes a proprotein 
which can be activated by cleavage of the proprotein portion to produce an active mature 
enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural 
enzyme or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 
15-28 and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved 
amino acid residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code, or (ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the mature enzyme is fused with another 
compound, such as a compound to increase the half-life of the enzyme (for example, 
polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature 
enzyme, such as a leader or secretory sequence or a sequence which is employed for 
purification of the mature enzyme or a proprotein sequence. Such fragments, derivatives 
and analogs are deemed to be within the scope of those skilled in the art from the teachings 
herein. 

The enzymes and polynucleotides of the present invention are preferably provided 
in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a 
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but 
the same polynucleotide or enzyme, separated from some or all of the coexisting materials 
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or 
such polynucleotides or enzymes could be part of a composition, and still be isolated in that 
such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 
and 61-64 (in particular the mature enzyme) as well as enzymes which have at least 70% 
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and 61- 
64 and more preferably at least 90% similarity (more preferably at least 90% identity) to 
the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95% 
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS: 15- 
28 and 61-64 and also include portions of such enzymes with such portion of the enzyme 
generally containing at least 30 amino acids and more preferably at least 50 amino acids. 

As known in the art, "similarity" between two enzymes is determined by comparing 
the amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative 
amino acid substitutions. Such substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala, 
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic 
residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of 
the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. 
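The conservative substitution classes listed above lend themselves to a simple lookup. The following is a minimal sketch, assuming three-letter residue codes as input; the group labels and the function name are hypothetical and merely restate the six classes in the text (aliphatic Ala, Val, Leu, Ile; hydroxyl Ser, Thr; acidic Asp, Glu; amide Asn, Gln; basic Lys, Arg; aromatic Phe, Tyr).

```python
# Illustrative sketch: classify a substitution using the groups named in the text.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def conservative_group(original, replacement):
    """Return the shared group name if the substitution is conservative, else None.

    Identical residues are not a substitution, so they also return None.
    """
    if original == replacement:
        return None
    for name, members in CONSERVATIVE_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None
```

With this lookup, Leu to Ile is reported as a conservative aliphatic replacement, while Asp to Lys is not conservative because the residues belong to different groups.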

Most highly preferred are variants which retain the same biological function and 
activity as the reference polypeptide from which they vary. 

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the 
present invention, host cells which are genetically engineered with vectors of the invention 
and the production of enzymes of the invention by recombinant techniques. 

Host cells are genetically engineered (transduced or transformed or transfected) with 
the vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of the 
present invention. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to the 
ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing 
enzymes by recombinant techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for expressing an enzyme. Such 
vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors 
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, 
adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as 
long as it is replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are 
deemed to be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli 
lac or trp, the phage lambda PL promoter and other promoters known to control expression 
of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for amplifying expression. 

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli. 

The vector containing the appropriate DNA sequence as hereinabove described, as 
well as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein. 

As representative examples of appropriate hosts, there may be mentioned: bacterial 
cells, such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells 
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes 
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed 
to be within the scope of those skilled in the art from the teachings herein. 

More particularly, the present invention also includes recombinant constructs 
comprising one or more of the sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention 
has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, 
a promoter, operably linked to the sequence. Large numbers of suitable vectors and 
promoters are known to those of skill in the art, and are commercially available. The 
following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 
(Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A 
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: 
pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). 
However, any other plasmid or vector may be used as long as they are replicable and viable 
in the host. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the 
above-described constructs. The host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a 
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can 
be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or 
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells 
under the control of appropriate promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the DNA constructs of the 
present invention. Appropriate cloning and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory 
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is 
hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the present invention by higher 
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are 
cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to 
increase its transcription. Examples include the SV40 enhancer on the late side of the 
replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance 
gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly- 
expressed gene to direct transcription of a downstream structural sequence. Such promoters 
can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate 
kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The 
heterologous structural sequence is assembled in appropriate phase with translation 
initiation and termination sequences, and preferably, a leader sequence capable of directing 
secretion of translated enzyme. Optionally, the heterologous sequence can encode a fusion 
enzyme including an N-terminal identification peptide imparting desired characteristics, 
e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural 
DNA sequence encoding a desired protein together with suitable translation initiation and 
termination signals in operable reading phase with a functional promoter. The vector will 
comprise one or more phenotypic selectable markers and an origin of replication to ensure 
maintenance of the vector and to, if desirable, provide amplification within the host. 
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella 
typhimurium and various species within the genera Pseudomonas, Streptomyces, and 
Staphylococcus, although others may also be employed as a matter of choice. 

As a representative but nonlimiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from 
commercially available plasmids comprising genetic elements of the well known cloning 
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI, 
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to 
an appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 

Cells are typically harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any 
convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or 
use of cell lysing agents; such methods are well known to those skilled in the art. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell 
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa 
and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, 
a suitable promoter and enhancer, and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, 
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice 
and polyadenylation sites may be used to provide the required nontranscribed genetic 
elements. 

The enzyme can be recovered and purified from recombinant cell cultures by 
methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a 
product of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and 
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may be 
non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the 
OC1/4V, 9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the 
food processing industry for the production of low lactose content milk and for the 
production of galactose or glucose from lactose contained in whey obtained in a large 
amount as a by-product in the production of cheese. Generally, it is desired that enzymes 
used in food processing, such as the aforementioned β-galactosidases, be stable at elevated 
temperatures to help prevent microbial contamination. 

These enzymes may also be employed in the pharmaceutical industry. The enzymes 
are used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in diagnostic applications, where they 
are employed as reporter molecules. 

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give 
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on internal 
linkages yielding cellobiose, glucose and cellooligosaccharides as products). β-Glucosidases 
are used in applications where glucose is the desired product. Accordingly, 
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and 
AEDII12RA-18B/G may be employed in a wide variety of industrial applications, including 
in corn wet milling for the separation of starch and gluten, in the fruit industry for 
clarification and equipment maintenance, in baking for viscosity reduction, in the textile 
industry for the processing of blue jeans, and in the detergent industry as an additive. For 
these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the 
present invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so obtained 
will then bind the enzymes themselves. In this manner, even a sequence encoding only a 
fragment of the enzymes can be used to generate antibodies binding the whole native 
enzymes. Such antibodies can then be used to isolate the enzyme from cells expressing that 
enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the 
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the 
EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent 
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in 
screening for similar enzymes from other organisms and samples. Such screening 
techniques are known in the art; for example, one such screening assay is described in 
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol. 160, pp. 87-116, 
which is hereby incorporated by reference in its entirety. 

The present invention will be further described with reference to the following 
examples; however, it is to be understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently 
occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by capital 
letters and/or numbers. The starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be constructed from available plasmids 
in accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction 
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes 
used herein are commercially available and their reaction conditions, cofactors and other 
requirements were used as would be known to the ordinarily skilled artisan. For analytical 
purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme 
in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid 
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in 
a larger volume. Appropriate buffers and substrate amounts for particular restriction 
enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are 
ordinarily used, but may vary in accordance with the supplier's instructions. After digestion 
the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired 
fragment. 
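As a worked illustration of the analytical-scale figures above (about 2 units of enzyme and about 20 µl of buffer per 1 µg of DNA), the following sketch scales those amounts linearly with DNA mass. Linear scaling is an assumption made here for illustration only; the text gives ranges rather than a rule for preparative digests, and the manufacturer's instructions govern in practice. The function name is hypothetical.

```python
# Illustrative sketch: scale the analytical digest figures quoted in the text.
def analytical_digest(dna_ug, units_per_ug=2.0, ul_per_ug=20.0):
    """Return enzyme units and buffer volume for an analytical-scale digest.

    Defaults restate the typical figures in the text (2 units and 20 ul per
    1 ug of DNA); scaling them linearly is an assumption, not a stated rule.
    """
    if dna_ug <= 0:
        raise ValueError("DNA amount must be positive")
    return {
        "enzyme_units": dna_ug * units_per_ug,
        "buffer_volume_ul": dna_ug * ul_per_ug,
    }
```

For 1 µg of DNA this simply returns the figures quoted in the text: 2 units of enzyme in 20 µl of buffer.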

Size separation of the cleaved fragments is performed using 8 percent 
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980). 

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized. Such 
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two 
double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise 
provided, ligation may be accomplished using known buffers and conditions with 10 units 
of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA 
fragments to be ligated. 

Unless otherwise stated, transformation was performed as described in the method 
of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes 

DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60, 
was initially amplified from a pBluescript vector containing the DNA by the PCR 
technique using the primers noted herein. The amplified sequences were then inserted into 
the respective pQE vector listed beneath the primer sequences, and the enzyme was 
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for 
the respective genes are as follows: 

Thermococcus AEDII12RA-18B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGA.-VTGCTATGATTCTC 3' (SEQ ID NO:29) 
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAG.AAGGTCCGATTTTCC 3' 
(SEQ ID NO:31) 

3' CGGAAGATCTTTAAGATnTAGAAA-rrCCTT 5' (SEQ ID NO:32) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus 9N2-31B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3' 
(SEQ ID NO:33) 

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34) 

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Staphylothermus marinus F1-12G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAACGTTTCCTGATTAT 3' 
(SEQ ID NO:35) 

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus chitonophagus GC74 - 22G 

5' CCGAGAATTCATTCATTAAAGAGGAG/XAATTAACTATGCTTCCAGGAGA.A.CTTTCTC 3' 
(SEQ ID NO:37) 

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
BamHI. 

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39) 
3' AATAAAAGCTTACTGGATC AGTGTAAGATCCT 5' (SEQ ID NO:40) 

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3' Hind 
III. 

Thermotoga maritima MSB8 - 6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41) 
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Pyrococcus furiosus VC1 - 7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43) 
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45) 

3' AATAAAAGCTTCCGCGTrGTACAGCGGTAATAGGC 5' (SEQ ID NO:46) 

Vector: pQE52; and contains the following restriction enzyme sites 5' BamHI and 3' 
Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5' tttattgaattcattaaagaggagaaattaactatgatctgtgtggaaatattcggaaag 3' 
(SEQ ID NO:47) 

3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48) 

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

Thermotoga maritima β-mannanase (6GP2) 

5' TrTATrC.AATTGAn.AAAGAGGAGA.A.\TT.AACTATGGGGATTGGTGGCGACGAC 3' 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC 5' (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

AEPII 1a β-mannanase (63GB1) 

5- TITATrGAATrCAnAAAGAGGAGAAATrAACTATGCTACCAGAAGAGTrCCTATGCMGC 3' 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

OC1/4V endoglucanase (33GP1) 

5' AAAAAACAATTG.AATTCATTAAAGAGGAGAAATTAACTATGGTAGAAAGACACTTCAGATATGTTCTT 
3' (SEQ ID NO:53) 

3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG 5' (SEQ ID NO:54) 

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3' 
EcoRI. 

Thermotoga maritima pullulanase (6GP3) 

5' TmGGAATrCAn.AAAGAGGAGAA.mAACTATGGAACTGATCATAGAAGGTTAC 3' 
(SEQ ID NO:55) 

3' ATAAGAAGCTnTCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

The restriction enzyme sites indicated correspond to the restriction enzyme sites on 
the bacterial expression vector indicated for the respective gene (Qiagen, Inc., Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Amp^r), a bacterial origin of replication 
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His 
tag and restriction enzyme sites. 
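
The pairing of primer and vector restriction sites described above can be checked mechanically. The following is a minimal sketch, not part of the described method: it searches a primer string for the recognition sequences of the enzymes named in this Example, using the Thermococcus 9N2 primers (SEQ ID NOS: 33 and 34) as printed and read left to right. The recognition sequences are the standard ones for these enzymes; the function name `find_sites` and the dictionary layout are illustrative assumptions.

```python
# Standard recognition sequences for the restriction enzymes named in Example 1.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(primer: str) -> dict[str, list[int]]:
    """Return 0-based start positions of every recognition site found in the primer."""
    seq = primer.upper().replace(" ", "")
    hits: dict[str, list[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Primers for Thermococcus 9N2 - 31B/G as printed above (vector pQE30).
PRIMER_5 = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC"  # SEQ ID NO:33
PRIMER_3 = "CGGAGGTACCTCACCCAAGTCCGAACTTCTC"  # SEQ ID NO:34

print(find_sites(PRIMER_5))
print(find_sites(PRIMER_3))
```

Under these assumptions the 5' primer should report only an EcoRI site and the 3' primer only a KpnI site, which agrees with the sites listed for pQE30.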

The pQE vector was digested with the restriction enzymes indicated. The amplified 
sequences were ligated into the respective pQE vector and inserted in frame with the 
sequence encoding for the RBS. The ligation mixture was then used to transform the E. coli 
strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple copies 
of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin 
resistance (Kan^r). Transformants were identified by their ability to grow on LB plates, and 
ampicillin/kanamycin-resistant colonies were selected. Plasmid DNA was isolated and 
confirmed by restriction analysis. Clones containing the desired constructs were grown 
overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml) 
and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture at a ratio of 
1:100 to 1:250. The cells were grown to an optical density at 600 nm (O.D.600) of between 0.4 and 
0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") was then added to a final 
concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O 
and leading to increased gene expression. Cells were grown an extra 3 to 4 hours. Cells were 
then harvested by centrifugation. 

The primer sequences set out above may also be employed to isolate the target gene 
from the deposited material by hybridization techniques described above. 
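
The inoculation ratio and the IPTG addition described in this Example reduce to simple dilution arithmetic. The sketch below is illustrative only: the 1 L culture volume and the 1 M IPTG stock are hypothetical values chosen for the example (the text specifies neither), and the function names are not from the source.

```python
def inoculum_volume_ml(culture_ml: float, dilution: float) -> float:
    """Volume of overnight culture needed for a 1:dilution inoculation."""
    return culture_ml / dilution


def iptg_stock_volume_ml(culture_ml: float, final_mM: float = 1.0,
                         stock_mM: float = 1000.0) -> float:
    """C1*V1 = C2*V2, neglecting the small volume contributed by the stock itself."""
    return culture_ml * final_mM / stock_mM


CULTURE_ML = 1000.0  # hypothetical culture volume (assumption)

low = inoculum_volume_ml(CULTURE_ML, 250)   # most dilute end of the 1:100 to 1:250 range
high = inoculum_volume_ml(CULTURE_ML, 100)  # most concentrated end of the range
iptg = iptg_stock_volume_ml(CULTURE_ML)     # 1 mM final from an assumed 1 M stock

print(f"O/N inoculum: {low:.1f} to {high:.1f} ml")
print(f"IPTG stock to add: {iptg:.2f} ml")
```

Scaling to another culture volume or stock concentration only requires changing the two assumed inputs.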

Example 2 

Isolation of a Selected Clone from the Deposited Genomic Clones 

A clone is isolated directly by screening the deposited material using the 
oligonucleotide primers set forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems 
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard protocol (Maniatis et al., 
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY, 
1982). The deposited clones in the pBluescript vectors may be employed to transform 
bacterial hosts, which are then plated on 1.5% agar plates to a density of 
20,000-50,000 pfu/150 mm plate. These plates are screened using nylon membranes according 
to the standard screening protocol (Stratagene, 1993). Specifically, the nylon 
membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm 
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is 
hybridized with hybridization buffer (6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml 
denatured, sonicated salmon sperm DNA) with 1 x 10^6 cpm/ml 32P-probe overnight at 
42°C. The membrane is washed at 45-50°C with washing buffer (6 x SSC, 0.1% SDS) 
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones are 
isolated and purified by secondary and tertiary screening. The purified clone is 
sequenced to verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide primers corresponding to the 
gene of interest are used to amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of the 
gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM 
each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq 
polymerase. Thirty-five cycles of PCR (denaturation at 94°C for 1 min; annealing at 
55°C for 1 min; elongation at 72°C for 1 min) are performed with the Perkin-Elmer 
Cetus automated thermal cycler. The amplified product is analyzed by agarose gel 
electrophoresis and the DNA band with the expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of interest by subcloning and 
sequencing the DNA product. The ends of the newly purified genes are nucleotide 
sequenced to identify full-length sequences. Complete sequencing of full-length genes is 
then performed by Exonuclease III digestion or primer walking. 
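
The PCR conditions given above can be written down as a small data structure, which makes the cycling time and reagent amounts easy to verify. This is a sketch under stated assumptions: the class and function names are illustrative, ramp times between temperatures are ignored, and the 25 µl reaction volume and concentrations are those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# One PCR cycle as described in the text: 1 min at each temperature.
CYCLE = (
    Step("denaturation", 94, 60),
    Step("annealing", 55, 60),
    Step("elongation", 72, 60),
)
N_CYCLES = 35
REACTION_UL = 25.0


def total_hold_minutes(cycle, n_cycles: int) -> float:
    """Total time spent at the programmed temperatures, excluding ramping."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


def pmol_in_reaction(conc_uM: float, volume_ul: float) -> float:
    """Amount in pmol: 1 uM x 1 ul = 1 pmol."""
    return conc_uM * volume_ul


hold_minutes = total_hold_minutes(CYCLE, N_CYCLES)
dntp_pmol_each = pmol_in_reaction(20.0, REACTION_UL)  # 20 uM of each dNTP

print(f"Programmed hold time: {hold_minutes:.0f} min")
print(f"Each dNTP per reaction: {dntp_pmol_each:.0f} pmol")
```

For comparison, the text specifies 25 pmol of each primer in the same 25 µl reaction.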

Example 3 
Screening for Galactosidase Activity 

α-Galactosidase protein activity may be screened for as follows: 

Substrate plates were provided by a standard plating procedure. Dilute XL1-Blue 
MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 
with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage. Mix 
gently and incubate tubes at 37 °C for 15 min. Add approximately 3.5 ml LB top 
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto an NZY plate surface. 
Allow to cool and incubate at 37 °C overnight. The assay plates contain the 
substrate p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM NaCl, 100 
mM potassium phosphate and 1% (w/v) agarose. The plaques are overlaid with 
nitrocellulose and incubated at 4 °C for 30 minutes, whereupon the nitrocellulose is 
removed and overlaid onto the substrate plates. The substrate plates are then incubated 
at 70 °C for 20 minutes. 

Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannanase activity. 

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified 
Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer): 5 
x 10^7 pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 
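
The serial dilution above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text (stock titer, the two dilution steps, and the 8 µl plated); the function name `serial_dilute` is an illustrative assumption.

```python
def serial_dilute(titer_pfu_per_ul: float, factors) -> float:
    """Apply successive 1:factor dilutions to a starting titer."""
    for factor in factors:
        titer_pfu_per_ul /= factor
    return titer_pfu_per_ul


STOCK_TITER = 5e7            # pfu/ul, amplified library
DILUTIONS = (1000, 100)      # 1:1000 followed by 1:100
PLATED_UL = 8.0              # volume of diluted phage per plate

working_titer = serial_dilute(STOCK_TITER, DILUTIONS)
pfu_per_plate = working_titer * PLATED_UL

print(f"Working titer: {working_titer:.0f} pfu/ul")
print(f"Plaques expected per plate: {pfu_per_plate:.0f}")
```

The working titer reproduces the 5 x 10^2 pfu/µl stated in the text, and 8 µl of it corresponds to roughly 4,000 pfu per plate.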

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

An Azo-galactomannan overlay was applied to the LB plates containing the 
lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer 
pH 7, 0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were 
incubated at 72 °C. The Azocarob-galactomannan treated plates were observed after 4 
hours then returned to incubation overnight. Putative positives were identified by 
clearing zones on the Azocarob-galactomannan plates. Two positive clones were 
observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannosidase activity. 

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified 
AEPII 1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^7 pfu/µl 
diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

A p-nitrophenyl-β-D-mannopyranoside overlay was applied to the LB plates 
containing the lambda plaques. The overlay contains 1% agarose, 50 mM 
potassium-phosphate buffer pH 7, 0.4% p-nitrophenyl-β-D-mannopyranoside (Megazyme, 
Australia). The plates were incubated at 72 °C. The 
p-nitrophenyl-β-D-mannopyranoside treated plates were observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing zones on the 
p-nitrophenyl-β-D-mannopyranoside plates. Two positive clones were observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 6 
Screening for Pullulanase Activity 

Pullulanase protein activity may be screened for as follows: 

Substrate plates were provided by a standard plating procedure. Host cells are 
diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate 200 
µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15 min. 
Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the mixture 
is plated, allowed to cool, and incubated at 37°C for about 28 hours. Overlays of 4.5 
ml of the following substrate are poured: 

100 ml total volume 

0.5 g    Red Pullulan (Megazyme, Australia) 
1.0 g    Agarose 
5 ml     Buffer (Tris-HCl pH 7.2 @ 75 °C) 
2 ml     5 M NaCl 
5 ml     CaCl2 (100 mM) 
85 ml    dH2O 

Plates are cooled at room temperature, and then incubated at 75 °C for 2 hours. 
Positives are observed as showing substrate degradation. 
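
The overlay recipe above implies final component concentrations that follow from C1·V1 = C2·V2. The sketch below works them out; it assumes the NaCl stock is 5 M and treats the nominal total volume as 100 ml (the listed liquid volumes sum to 97 ml, so the true values would differ slightly). All names are illustrative.

```python
def final_conc(stock_conc: float, stock_ml: float, total_ml: float) -> float:
    """Final concentration after dilution, from C1*V1 = C2*V2."""
    return stock_conc * stock_ml / total_ml


def percent_w_v(grams: float, total_ml: float) -> float:
    """Weight/volume percentage: grams per 100 ml."""
    return grams / total_ml * 100


TOTAL_ML = 100.0   # nominal total volume stated in the recipe
OVERLAY_ML = 4.5   # overlay volume poured per plate

nacl_mM = final_conc(5000.0, 2.0, TOTAL_ML)    # 2 ml of an assumed 5 M stock
cacl2_mM = final_conc(100.0, 5.0, TOTAL_ML)    # 5 ml of 100 mM stock
pullulan_pct = percent_w_v(0.5, TOTAL_ML)
agarose_pct = percent_w_v(1.0, TOTAL_ML)
overlays_per_batch = int(TOTAL_ML // OVERLAY_ML)

print(f"NaCl: {nacl_mM:.0f} mM, CaCl2: {cacl2_mM:.0f} mM")
print(f"Red Pullulan: {pullulan_pct:.1f}% (w/v), agarose: {agarose_pct:.1f}% (w/v)")
print(f"Full overlays per 100 ml batch: {overlays_per_batch}")
```

Under these assumptions one batch covers 22 plates, with NaCl at about 100 mM and CaCl2 at about 5 mM.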

Example 7 
Screening for Endoglucanase Activity 

Endoglucanase protein activity may be screened for as follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates 
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose. The 
plates are incubated at 37°C overnight. 

2. Plates are chilled at 4°C for one hour. 

3. The plates are overlaid with Duralon membranes (Stratagene) at room 
temperature for one hour and the membranes are oriented and lifted off the plates and 
stored at 4°C. 

4. The top agarose layer is removed and plates are incubated at 37°C for ~3 
hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 minutes. 

7. The plate is destained with 1 M NaCl. 

8. The putative positives identified on the plate are isolated from the Duralon 
membrane (positives are identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3. 

9. Insert DNA is subcloned into any appropriate cloning vector and 
subclones are reassayed for CMCase activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at maximum speed for 3 
minutes. 

ii) Decant the supernatant and use it to fill "wells" that have been 
made in an LB/GelRite/0.1% CMC plate. 

iii) Incubate at 37°C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1 M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around clone. 

Numerous modifications and variations of the present invention are possible in 
light of the above teachings and, therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide selected from the group consisting of: 

(a) SEQ ID NOS: 1-14 and 57-60; 

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U; 

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14 and 
57-60; 

(d) polynucleotide sequences which encode an amino acid sequence as set 
forth in SEQ ID NOS: 15-28 and 61-64; and 

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive bases in 
length and that will selectively hybridize to DNA which encodes a 
polypeptide of SEQ ID NOS: 15-28 and 61-64. 

2. A vector comprising a polynucleotide of claim 1. 

3. A host cell containing the vector of claim 2. 

4. The method of claim 3, wherein the host cell is a eukaryotic cell. 

5. The method of claim 3, wherein the host cell is a prokaryotic cell. 

6. A method for producing a polypeptide comprising: 

(a) culturing the host cells of claim 3; 

(b) expressing from the host cell of claim 3 a polypeptide encoded by said 
polynucleotide; and 

(c) isolating the polypeptide. 

7. An enzyme selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID NOS: 
15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 consecutive amino acid residues as 
an enzyme of (a). 

8. An enzyme of which at least a portion is coded for by a polynucleotide of 
claim 1, and which is selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence which is at least 70% 
identical to an amino acid sequence selected from the group of amino 
acid sequences set forth in SEQ ID NOS:15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 amino acid residues to the 
enzyme of (a). 

9. A method for generating glucose from soluble cell oligosaccharides comprising 
contacting a sample containing oligosaccharides with an effective amount of an 
enzyme selected from the group consisting of an enzyme having the amino acid 
sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64 such that glucose is 
produced. 

10. The method of claim 9, wherein the sample is selected from the group consisting 
of dairy products, fruit juices, detergents, textiles, guar gum, animal feed, plant 
biomass and waste products. 

11. The method of claim 9, wherein the oligosaccharide is selected from the group 
consisting of maltose, cellobiose, lactose, sucrose, raffinose, stachyose, 
verbascose, cellulose, starch, amylose, glycogen, disaccharides, polysaccharides 
and pullulan. 

Figure 6 

THERMOCOCCUS CHITONOPHAGUS GLYCOSIDASE - 22G 
COMPLETE SEQUENCE - 9/95 

[Nucleotide sequence and deduced amino acid sequence of Thermococcus chitonophagus glycosidase 22G] 

Figure 7a 

Figure 7b (Continued) 

Figure 8a 

Figure 8b (Continued) 

[Nucleotide sequence and deduced amino acid sequence of Bankia gouldi endoglucanase (37GP1)] 

Figure 9a 

[Nucleotide sequence and deduced amino acid sequence of Bankia gouldi endoglucanase (37GP1) (continued)] 

Figure 9b (Continued) 

Bankia gouldi endoglucanase (37GP1) (continued) 

1413 1422 1««0 1449 liSfl 

ACC CAC ACC GCC ACT GTC GCT ATC GAC GAT TTC CCA CTG GAT GGC CCC TAC COC 
Thr Hi, Thr Ala Thr Val Ala lie A,p Aap Phe Pro Uu S S 2£ 

1467 1476 1494 1503 .... 

ACC CTG CCC TTA CAC AAC CTG CCC OGG GAG CAA ACC TTC CTA TCT CAC CGA GAC 
Thr Leu Arg Leu Hxa Aan Leu Pro Gly Glu Glu Thr Phe Val Ser Hia Arg Aap 

1521 1330 "39 1548 15 S7 i 5ss 

AAC GCC CTG CAA AAA GCT ACA CTG CGC GCC AGC CAC AAT ACG CTA ACA CTc'cAC 
Asa Ala Lau Glu Ly 9 Cly Thr VI Arg Ala Ser Asp Aan Thr v.l Thr Leu Glu 

1575 1584 1593 1«02 isil 

TTG CCC CCT CTG TCC GTT ACT GCA ATA TT6 CTC AAC GCC CGG CCC TAA 3 ' 
Leu Pro Pro Leu Ser Val Thr Ala lie Leu Leu Ly. Ala Arg Pro 



Figure 9a (Continued) 

[Nucleotide sequence and deduced amino acid sequence of Thermotoga maritima alpha-galactosidase] 

Figure 10a 

Thermotoga maritima Alpha-galactosidase 

^ ™ *^!?? ^ crc **° wc 01X5 ** cic ore aat rrc 

Aap L«u thr rrp Clu Clu Thr U,u Lys uj £ tej Ja* Lys ^ j£ ^ 

657 figfi £75 

^Tf ITf f^? f^f ^ff TAC ^ ATA CGT GAC TGG CTC 

Pho Clu Val Phe Gin He A*p Asp Ail T^r clu £ Asp lie Gly Asp Trp Leu 

711 720 729 738 747 

OT3 ACA AGA GGA GAC TTT CCA TCG CTC GAA GAG ATC GCA AAA OTT ATA OCC GAA 

Val Tfcr Arg Gly Asp Phe Pro Ser Val Clu Clu Met Ail Lys Val lie Ala clu 

Asn Gly Phe He Pro Gly lie Trp Thr Ala Pro Pha Ser Val Ser Glu Thr" 

819 828 837 846 855 * B64 

^f^™WC^CATCC3C^TOCTI*C^ 

A*P Val Pfae Asn Glu His Pro Asp Trp Val Val Lys Glu to cl^ Glu J*o Lys 

~ ^ Oil 882 831 300 509 518 

*^AA GAT 

Met Ala Tyr Arg Asn Trp Asn Lys Lyn lie Tyr Ala Leu Asp Leu Ser Lys Aep 

327 936 945 954 563 ' 

CAG GTT CTG AAC TGG CTT TTC GAT CTC TTC TCA TCT CTG AGA AAG ATG OGC TAC 

Glu Val Leu Asn Trp Leu Phe Asp Leu Phe Ser Ser Leu Arr. Lya- Met Gly Tyr 

*~ ^ !Ii 990 995 1008 1026 

AGG T3^C TTC AAC ATC GAC TTT CTC TTC GOG QGT GCC CTT CCA GGA GAA AGA AAA 

Arg Tyr Phe Lys lie Asp Phe Leu Phe Ala Gly Ala Val Pro Gly Glu Arg Lya 

1035 1044 1053 1062 1071 10B0 

AAG AAC ATA ACA CCA ATT CAG GCC TTC AGA AAA CGG ATT GAG ACG ATC AGA AAA 

Lys Asn Ho Thr Pro He Gin Ala Phe Arg Lys Gly lie Glu Thr lie Arg Lye 

n~ 1090 1107 HI 6 1125 1134 

ra<pcc^GAAC*TTCTTCATCCTCC^TCr(XC 

Ala Val Gly Glu Asp See Phe lie Leu Gly Cya Gly Sex Pro Leu Leu Pro All 

U43 HS2 1161 1170 1179 U88 

CTC GGA TCC CTC GAC GCC ATG AGO ATA GGA CCT GAC ACT GCG CCG TTC TGG GGA 

Val Gly cys Vnl Asp Gly Koc Arg Ho Gly Pro Asp Thr Alu Pro Phe Trp Gly 

Figure 10b (Continued) 

[Nucleotide sequence and deduced amino acid sequence of Thermotoga maritima alpha-galactosidase (continued)] 

Figure 10c (Continued) 

9 18 27 36 45 54 

5' ATG GGG ATT GGT GGC GAC GAC TCC TCG AGC CCG TCA CTA TCG GCG GAA TTC CTT 

Mec Gly lie Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Leu 

63 72 81 50 99 108 

TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA ACT GAC GAG TTC CTG AAA 

Leu Leu 1X0 Val Glu Lou Ser Phe Val Leu Phe Ala Ser A*p Glu Phe Val Lys 

117 126 135 144 153 162 

GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC 

Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe lie Gly Ser 

171 180 189 198 207 216 

AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT GTT CTG GAG 

ten pin Tyr Tyr Met His Tyr Lys Ser Asn Gly Mat lie Asp Ser Val Leu Glu 

225 . 234 243 252 251 270 

AGT GCC AGA GAC ATG GGT ATA AAG GTC CTC AGA ATC TGG GGT TTC CTC GAC GGC 

Ser All Arg Asp Met Gly lie Lys Val Leu Arg He. Trp Gly Pha Leu Asp Gly 

279 28B 297 306 315 324 

GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC 

Glu III TVr C^s Arg Asp Ly* Aon Thr Tyr Met His Pro Glu Pro Gly Val Phe 

333 342 351 360 369 37B 

GGG GTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC 

Gly Val Pro Glu Gly Ho Ser Asn Ala Gin Ser Gly Pha Glu Arg Leu Asp Tyr 

3B7 396 405 414 423 432 

ACA GTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC 

Thr Val All Lys Ala Lys Glu Leu Gly lie Lys Leu Val He Val Leu Val Asn 

m 450 459 468 477 486 

AAC TGG GAC GAC TTC GGT GGA ATG AAC CAG TAC GTG AGG TGG TTT GGA GGA ACC 

A*n Tr^ Asp Asp Phe Gly Gly Met Asn Gin Tyr Vol Arg Trp Phe Gly Gl : Thr 

495 504 513 522 531 540 

CAT CAC GAC GAT TTC TAC ACA GAT GAG AAG ATC AAA GAA GAG TAC AAA AAG TAC 

His His Isp Asp Phe Tyr Arg Asp Glu Lys lie Lys Glu Glu Tyr Lys Lys Tyr 

Figure 11a 

Thermotoga maritima β-mannanase (6GP2) (continued) 
549 558 567 S76 :B5 594 

?I? If? ?I! GTA CAT CTC AAT ACC TAC acg cct tac agg gaI" 

Val Ser Phe Leu Val Asn Hia Val Asn Thr Tyr Thr Cly Val Pro Tyr 



Arg Glu 

603 612 "1 630 639 «„„ 

GAG CCC ACC ATC ATC CCC TGG GAG CTT GCA AAC GAA CCG CCC TGT GAG ACG oil 

Glu Pro Thr He Met Ala Trp Glu Leu Ala As'n Olu Pro A^g c^s Glu Thr A^p 

657 666 675 «84 603 

™ Iff fff ™ *f! CTTGMKCCn ATG *CC TCC TAC ATA fll 

Lys Ser Cly Aaa Thr Lou Val Glu Trp Val Lys ciu Met Ser Ser ^ nl " ly l 

711 720 7 29 738 747 ... 

ACT CTC GAT CCC AAC CAC CTC GTG CTC GAC TTC TTC AGC AAC 

Ser Leu Asp Pro Asn Hia Leu Val Ala Val Gly Asp Glu Gly Ihl Ihl sir 

765 774 783 792 801 3 

f-f ^ ™ Hf ^ «** ^ ^ ^ ™= AAC GGC TGG 

Tyr Glu Gly Phe Ly 8 Pro Tyr Gly Gly Glu Ala Glu Trp" Ala T^ Asn Gly T^p" 

819 828 837 846 855 854 

Iff ffl fl! " C *** m ™ * TA ^ ACG W> OAC TTC GCC ACG 

Ser Gly Val Asp Trp Lya Lys Leu Leu Ser lie Glu Thr Val Z'p Phe Gly Thr 

873 88 2 891 900 909 91a 

TTC CAC CTC TAT CCG TCC CAC TGG GGT GTC ACT CCA GAG AAC TAT GCC CAG TCC 

Phe His Lou Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asa T^ Ala Gin Trp 

927 936 945 954 963 

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA CCC 

Gly Ala Lys Trp lie Glu Asp His lie Lys He Ala Lys Glu lie Gly lyl 7*1 

981 990 "9 1008 1017 102 S 

CTT GTT CTG CAA GAA TAT GCA ATT CCA AAG ACT GCG CCA GTT AAC AGA ACG GCC 

Val val Leu Glu Glu Tyr Gly lie Pro Lys Ser Ala Pro Val Zn Z'g Thr AU 

1035 1044 "53 1062 1071 1080 

ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT GGA GCG ATG 

lie Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp Gly All Met 

Figure lib (Continued) 
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Thereto** *arUioa ^n^aao ( ^ (contlauod) ^ 
1089 1098 no? 

™ ~ .!! ™ !f? ATC 000 <** « *» «* AGA cac 1 ^ AGA 

phe Trp Mec ^ u oly u . Gly Glu --; - ~- ~ --- --. .__ 

1143 li52 1161 inn 

TAT CCG CAC TAC ^ OCT TTC AGA ATA GTC AAC^C GAC AGT^CCA GAA GCC^ 

TVT Pro As, rvr Asp Gly Phe Ar g n, vll' Z~ n £ Z' P s~ £ ~u £ ^ 

11*7 1206 1215 

CTG ATA A CA GAA TAC GCG AAG CTO TTC AAC ACA GGT GAA CVC^A AGA ^ 
L-u lie oiu Tyr Al a ^ Leu Ph . ~ ~ ~ ~ ~ £ £ --- --- 

1251 1260 1269 ,,, fl 

ACC TCC TCT TTC ATC CTT CCA AAA CAC CCC ATC AAA^AAG ACC ^ 

Thr cy* ser Phe lie Leu Pro Lys Asp cly ~ ~ ~ ~ ~ ~ -« 

1305 1314- 1323 n „ 

~ -~ ~ tT. ^ ?AC A<X " C *« *S ^ AAG^TTG TCT CTC^ 

V.1 Arc Ala Cly Val Phe A,p Z sZ Asn £ £ Glu £ £ ~ ~ ~ 

1359 1368 1377 13Sfi 

~ ™ "I !?! ^ ™ ™ <*= * TA " 3 « gga"I c ggx att™ 

Val Glu Asp Lou Val P he Glu ^ ~ £ "~ ~ — "~ ~ ~ ~- 

1413 1422 143! nan 

GGC TTT CAT CTC C.C ACA ACC CGG ATC CCG GAT CGA GAA CAT GAA A TG TTC^ 

Gly Phe A*p l« Thr Thr Arg HI III Z' P Gly «~ £ ^ H« ^ £ 

1467 1476 1485 140/ 

Glu Cly His Phe Cln oiy Ly. Thr Val Ly, s" £ £ ~ ~ ~ ~" 
1521 1530 i 53g 

~ ™ !^ ™ ™« «* CAG «* SJ rrr tcc^ oca caa^g 

A*n Glu ax. Axg Ty r Val U,u Alo Glu ~ ~" ~ ~ ~ ~ ;» ~- --- 

1575 1584 1593 lficn 

~ ^ ?f A !! !! A ACC ro ™ « «» "c 1 ^ tca cct'gac 

VX LX3 A B n Trp Trp Asn Ser Gl^ ^ £ ^ ^ ^ ^ ~ - - ™ 

Figure 11c (Continued)
Thermotoga maritima β-mannanase (6GP2) (continued)



1629 "38 1647 iS56 



Arr oaa we mc cac era goa ^ cca gca ch/Iac ctc aaa'S 

Ha Olu Trp a™ Ci y 01u Val ^ . ~ ~ ~ - - — ~- --- --- --- 

CCC CCA^ ACC CAc'tcI CAA GAA^ ACA GTA^GCA ACC AAG^TTC CAA AGA^CTC 
Pro Ciy Lys Ser Trp Gltf Clu Vfl -" ^ - - -- ; »- ... -.. ... 

1737 i74fi 175 - 

^ ^ !!! ~ *!? ™ ™ !« ~ - ** 2} CCA AAC^GTC GAG GGA ^ 

S«r Olu cy a Glu n. Leu Olu Tyr £ il'e' Tyr £ p"ro" £ va\* £ u ~ ~ 
1791 1B00 1809 ifiio 

~ ^ ^ ™ !f? « ! AC «» «™ ™ occ reo 1 ™ aag ata^ 

Lys Ciy Arg Leu Arg Pro Tyx Ala" £ £ £ £ ~ ~ ~ ~ -~ 

1845 1854 1863 i *75 

« CAC ATC AAC AAC CCC AAC C- CAA ACT OCS*S ATC ATC ACT TTC CCC^ 

Asn Val Glu Ser AU Glu He He Thx Phe ciy ciy 



Leu A_sp Met Aan Aan Ala 



1899 1908 1917 1326 



^ ^ !f ^ ^ ™ « «* ™ GAC ACA CCC CCC 1 ^ 

Lys Glu Tyr Arg Arg Phe Hi B Val" Arg il'e «"« P "hc L'p Arg Al"a Gl~y vll 

1353 1962 1971 .,„- , oob 

AAA CAA CTT CAC ATA CCA CTT CTC GOT CAT CAT CTG ACC TAC GAT CCA CCc"' 8 



ATT 



Ly» Clu ^ His He Gly Val Val ciy Asp Hi, Leu Arg Tyr L" P G* £ £ 

H? .!! !?! "I !!? 2 ?!? ^! ™ 2 *" *°* *» a £I oot A TO 2 T=i 3 . 

Phe lie Aep Aan Val Arg Leu Tyr Lye Irg Thr Gly Gly Met 



Figure 11d (Continued)
AEPII 1a β-mannosidase (63GB1)
9 18 27 



He C uu Pro Clu „. Leu Trp Qly - ~ »- --- --- --- ... ... ... 

«y ^ Arg lle - ;;; - - ^ ~ ~ ... ... 

- ^ ^ Pro Ph8 Aoa ao Lyo Lys - - - - ^ - --- --. ... 

*«. c iy n e ^ ^ ^ au ^ u Ph9 --- ~ ~ -- : ~ ~- ~- --- --. 



225 234 243 ,e, 

Gly ^ Aim Ala ^ ^ ng G -; ~ ;» ~ ~ -~ --- ... ... 

379 288 297 

Pro ^ Thr val tap clu val - ™ ~ - ~- ~ --- --- 

*i ^ „. ^ Lya sar ^ ^ Ala £ ~; - - - - - 

387 356 405 a 

=M B» «. « « ra « „ „ M J[i m ^ £ m ^ «2 

«» «« « ^ ^ «, v .; ™ ~ ~ ~ ~. ... ... 



*. V.X P.. v.! L _ H1 . ~ = ~ ~ ~ ~ ~- --- --- ... 

... v., M . ^ „„ ty ..„. t „ ^ - - - ~ - s -- --■ 

Figure 12a
AEPII 1a β-mannosidase (63GB1) (continued)



549 558 567 576 585 594 

ACG ACA CTT GTT GAG TTT GCC AAG TAT GCT GCT TAG ATC GCC CAT GCG CTC GGA 

Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala His Ala Leu Gly

603 612 621 630 639 648

GAC CTC GTG GAC ACA TCC AGC ACC TTC AAC GAA CCT ATG GTA GTT GTG GAG CTC 

Asp Leu Val Asp Thr Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu

657 666 675 684 693 702

GGC TAC CTC GCC CCC TAC TCA GGA TTT CCC CCG GGA GTC ATG AAC CCC GAG GCC 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala 

711 720 729 738 747 756 

GCG AAG CTG GCG ATC CTC AAC ATG ATA AAC CCC CAC GCC TTC GGA TAT AAG ATG 

Ala Lys Leu Ala Ile Leu Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met

765 774 783 792 801 810

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAG GAT AGC AAG TCC CCT GCG GAC 

Ile Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp

819 828 837 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC GCT CTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly Ile Ile Tyr Asn Asn Ile Gly Val Ala Tyr Pro Lys Asp Pro Asn Asp

873 882 891 900 909 918

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC AGC GGA CTG TTC 



Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe



927 936 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC GAC GGC GAA AAC TTT 



Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe



981 990 999 1008 1017 1026 

GTA AAA GTT AGA CAC CTA AAA GGC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC 

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr

1035 1044 1053 1062 1071 1080 

CGC GAG GTT GTT AGA TAT TCG GAG CCC AAG TTC CCA AGT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

Figure 12b (Continued)
AEPII 1a β-mannosidase (63GB1) (continued)



1089 1098 UC7 ms 



«C MO CCC „ C_C_c AAC TAC CCC TAC TCC W ™ ccc ^ ^ 

- «y v. a Pro ^ ^ cly ^ - ~ ~ --- --- ... ... 

li43 1152 iifii 

« «C „ CCC GTC AGC CAT AK CCC JCC ^ „ r «»• „ ^W. 

A.p Oly « -„ » tl ,„ „„ Il0 wy ~ ~ -~ ~ - ~ --- ... _ 

=f « .«.« »CC «^ „ „ IAC ^ B ^ 

Asp Sor He Val Glu Ala TVr r*~, « 

Ala T*r Lys Tyx Ser Val Pro Val Tyr Val Thr Glu Asn 

"SI 1260 1269 .„„ 

«« «T CCO CAT « CCC GAC ACC CTG ACG CCa'S TAC ATA^GTC AGC CAC^ 

Cly Val Ala ^ Ser M . ^ m uu ~ ~ ^ ~ ~ --- --- ... 

TCA AAC 1 ^ CAc ^ A „ ^Jg ^ ^ ^ ^13«1 ^ ^350 
«« I*. He 81u 01- Al a Xle Clu Asn « y £ Pro ^" ~ ~ ~ ~ ~ 

~ ™™ ™ ™ ™ «» « « CTC^ TTC AGC^ATG ACC TTt'SJ 

Trp Ala u« Thr A« P Asn Tvr Clu Trp AU £ c7 y £ £ £ ~ ~ ™ 

1422 1431 

««««««« 

t„ W . »„ „. s „ ly , „, - ™ ~ ~ -~ --- ... 

1467 147$ 14fl5 

CAC -A TAT CCC ACC ATA GTG CAC TCC AAC CCT AAG^ ATC ^ 

«« Xle Tyr Ar, at, Il0 val Cln Ser Asn ^ Val" Pro £ ~ ~ ~ ~ 

"ai 1530 1539 

GAG TTC CTG AAG GGT GAG GAG AAA TGA 3' 

Glu Phe Leu Lys Gly Glu Glu Lys

Figure 12c (Continued)
OC1/4V Endoglucanase (33GP1)
5- AW GTA GAA AGA CAC TTC ACA TAT GTT CTT ATT WC ACC CTO TTT CTT GTT AW 

M.e Vai Glu ax 5 « is Pha at, ^ n; n; - - - e - - - : - --- 

CTC CTA ATC TCA TCC ACT CAG TGT GGA AAA AAT GAA CCA AAC AGA m 111 
Leu Lau Xle S« ser Tn* Gin c£ £ £ ^ ^ £ ~ £ - »- 

117 126 135 144 

Ser Met Glu Gin Ser Val Ala «~ Z L*p" III Asn s'lr ^ £ Glu" Tyr As~„ 

171 180 189 198 - A , 

^ *!! fl* !fl ™ ^ ™ ™ A " «» "* «T ™ CAA S OCT TTC £J 
Ly* «ec Val Cly l*. Gly 'vll As"n III « y ~ ^ ~ "~ ~" ~ ~ ~- 

225 234 343 2S2 2fi , 

~ ™ ^ ™ ™ ^ ^ ™ ™ CAA TAT TTT CAS ATA ATA AAG AAA 22 
Gly Ala Txp Gly Vai Arg He Gl~u Asp Cl*u iyl »o Gl'u HI 'ill ly a ^ ^ 

279 288 237 306 

GGA TTT GAT TCT CTT AGO ATT CCC ATA AGA TGG TCA GCA CAT ATA TCC GAA 22 

Gly Phe Aap Ser Val Arg n e Pr = lie" ^ Trp ^ H' B ~ ~ ~ £ 

CCA CCA TAT CAT ATT £c AGO AAT JS CTC GAA IS CTT AAC CAT GTT CTC CAT 

Pro Pro Tyr Asp lie Asp Arg Asn p"^ ~iZ clu Arg vll Zl vll Z' P 

387 396 405 4ii 

AGG GCT CTT GAG AAT AAT TTA ACA GTA ATC ATC AAT ACG CAC CAT TTT GAA GAA 

Arg Ala L*u Glu Asn Asn I^u Tnr ^ n~ e ^ L'n Tnr hII j£ £ ~ ~" 

~ ~I ™ ™ ff? ^ ^ f 30 «» °" ™ «« gaa aS wc aga *2 

Leu Tyr Gln Glu Pro Asp ~ ~ ~ ~ ^ ™ ~ ~- ~ --- 

ATT OCA AAA TTC TTT AAA GAT TAC CCC GAA AAT CTG TTC TTT GaI ATC TAC JS 
He Ala Lye Phe Ph, Lya Asp Tyr III HI ^ ^ ^ ^ ~ — — 

Figure 13a
OC1/4V Endoglucanase (33GP1) (continued)
558 S67 57s 

GAG CCT CCT CAG AAC TTG ACA GCT GAA AAA TGG AAC GCA CTT SJ CCA AAA « 

Clu Pro Ala Gin Asn Lau Thr Ala Glu iyl Trp" Zl Zl Zl Zyl Zl ~ 

603 $12 621 

~ ™ ™ ^ ^ ™ ^ ^ « * CC CG3 «T =TC ATT ATC GAT GCT "a 
Leu Ly„ Val lie Arg Glu sll Zn p"ro Thr" ^ a," val" ^ ~ ~ ~ ™ 

657 666 675 £R4 

~ ™ ^ ™ !*I ™ ^ ™ ^ AGT CTA ™ BC GAC AAA S2 
Asn Trp Ala His Tyr Ser Ala Val Zl Zl Zl Zl Zl vll Zl Zl HI Z~ a 

711 720 ? 29 738 747 

ATC ATT GTT TCC TIC CAT TAC TAG GAA CCT TTC AAA TTC ACA CAT CAG CCT 

Ilo He Val Ser Pho Hi 9 Tyr Tyr Glu Pro III Zl ^o Tto ^ cZ Zy Zl 

^ 5 774 783 793 fioi 

™ !!??!? ™. ff f A !f ^ f CT GTT AG0 **» «» *** «c ga C ^ JJ2 

Glu Trp val As* Pro Il e Pro Pro V al Zl Zl Lys Trp" «y Glu Glu 



813 828 «7 846 355 

~ -!! ^ ^ A !! ^ ACT aT TO TAC 01X3 GAC TGG GCA AAG 

Glu lie Asa Gin Ilo Arg Ser His Phe Lyl Tyr vil Ser Zip Try" AU Ly^ Gin 

873 882 851 900 on? ... 

Aon Asn Val Pro He Pho Leu Gly Glu Phe Gly Al'a Tyr Ser Zl Zl Z~» Zl 

537 836 945 »54 963 o-,, 

*^ '* AA AGA AAA CAA TTT GGA 

Asp Ser Arg Val Lys Trp Thr Glu Ser Zl Zl Zl Zl Zl III Zl Zl Zy 

981 990 9 »9 1008 inn 

HI If- !*? ^ ™ ^ ^ ™ ™ «A CGA TTT GGC ATA TAC GAT AGA TGG 

Pho Ser Tyr Ala Tyr Trp Glu Phe Cys Ala Zl Zl Gly III Tyr Zl Zl Zl 
1035 1044 "53 1062 in-n lnon 

I" ft* "f I?! A If I" II? CCA ACA «* G ™ «* a CA ggc aaa gag 

Ser Gin Asn Trp Ile Glu Pro Leu Ala Thr Ala Val Zl Zl Zl Zl'y Z's Zl 
TAA 3'

Figure 13b (Continued)
Mec A SP , eu Thr , y , Vil cly Ile „, ^ ~ - ~ - ~ --- --- --. 

Asp Val Ala Lye Asp Arc Phe n.. ri~~ 7""" 

P Arg Phe lie ciu lie Lys Asp Cly Lys Ala Clu Val Txp 

117 126 135 n/i 

-! - !!! ^ « ™ ~ «» «» «C TAC GAA AAA CCA CAC £ TCT CCC IS 
II- I#au CI* Cl y Val ciu Ciu Uo ^ £ Glu " i~ £ £ £ ~ ~ ~ 

II. Ph. P ha Ala Gla Ala ^ ser ^ ~ ~ ^ -~ - ~ ™ --- --- 

225 23 < 243 2«;3 

-:: .!! ™^™*"»*«*<««e x* 8 ^ act «, 22 ^ L° 
pro v. x Thr Lys Lys Ly8 Glu ^ u ~ ~ » - --- --- ... 

279 288 297 3 a< 

-IT ^ ™ ^ ™^ OAT CCC A« GAC ATA HI m ACS £ 

He Pro Val Ser Ar g val Glu Ly , Ala £ Pro ^ ~ ~ ~ - ~ - 

I--!!!^^!!!^ TCT ^Sc ro AAA^^^ c ^ AW ^^ 
ryr vai at, Ila w ^ u Ser Glu S9r ~ ~ » : - - ~ ... 

VaX « u Uu X1 . lle Glu Qly ^ ^ ^ ~ ~ ~- --- ... _.. 

441 450 icq 

~ ^ ^ ™ ™ - - -A CAC CTC OCA S CTA TAT J£ CCA GAG 25 
L eu Asp Asp Tyr Tyr Tyr Asp ci y Glu £ ^ ~ ~ ~ ~ ~" ~ ~ 

495 504 513 535 

ACQ ATA ™^« Wweccmww g flWAw «l mnc lJ. 

Thx lie P h * Ar B Val Trp Ser Pro v."I £ £ £ ~ ~ ~ ~" ~ £ 

Figure 14a
Thermotoga maritima Pullulanase (6GP3) (continued)



549 558 567 576 



) 

585 



Ly» A S n «y Glu Asp Thr Glu ^ ~ ~ ~ ~ ~ ~ - ~- --- 

603 612 62 i 

AAC COO OTC TOG CAA GCC OTP CTT CAA GCC GAT S CAC CGA £ TO TAC £ 

Asn Oly Val Trp Clu Ma Vol Va" «u' ^ ~ ~ ~ ~ ~ - ~ ~ 

657 666 675 fiB4 £D . 

TAT CAG CTG GAA AAC TAC GCA AAG ATC ACA ACA ACC GTC GAT £ TAT TCG AAA 

Tvr Gin Lou G lu Aan Tyr Oly Lys n"o" Arg ^ Tnr Val " Asp Pro Tyr s« Zy.' 

711 720 729 738 -w-, 

™ ^ 3« G« AAC AAC CAA GAG AGC GCC GTT CTG AAT CTT GCC AGS ACA MC 

Ala VI Tyr AU Aan Aan Gin Glu Ser Al'a V a "i vll Asa ~iZ HI Zn 



765 77 « 783 792 801 



-~ ™ ^ !?! ™ ™ ™™ « 668 ™ GGA TAg" GAA GAC £ 

Pro Clu Gly Trp Glu Asn Aap Arg Gl'y III £ HI III Gly Tyr clu ™ ~ 

819 828 8 " 846 855 

-I- I" ™ I™ ^ *™ ^ ^ * TC CTC GAA AAc" TCC GGG CTA 

He Ho Tyr Glu lie Hio Ilo Ma lap He Thr Gly Leu clu A^n sir Gly vll 

973 882 "1 S00 90« 

~ ~- ~ !!? ™ ™! !!! « ™ «* «* JS GGA CCG « 

Lys A*n Ly, Gly Leu Tyr Lou Gly Uu T^r oiu III Aan Tar Zya Gl"y Pro Cl'y 

927 936 

~! !I? !°! ?H If? ^ c ?™ ^ «** « JS cac ot S t 

Gly val Thr Thr Gly Lou Ser Hi* Leu v«I Glu Leu oly Val Thr vll His 

-I- !T! !™ ™ H! ?! ™ ™ *=? « ^ gat rrc 1 ^ 

Ho Leu Pro Ph„ Phe A S p Phe ^ ^ ™ c ~ ~ ~ ^ ~; ™ 

1035 1044 1053 10G3 , Ml 

AAG TAC TAC MC TOG GOT TAC GAT CCT TAC CTG TTC ATG CTT CCG GAG GGC^AGA 

Lys Tyr Tyr Aan Trp Gly Tyr Aap Yrl Tyr Leu ^ ^ ~ G ^ ~; 

Figure 14b (Continued)
1089 1098 1107

~ » *C « « AM AAC CCA CAC ACC AGA GAA^CTC AAA gaa'£ 

*. «. ». A„ ,„ t „ _ = - - ~ ~- --- ... ... ... 



« « CAC „■» OCT ATC CAc'^ „ 

V, * AU ^ «. L „ „, My ~ ~ - - -- -- -.. ... ... ... 

1157 1206 i 3l5 

- w My „. =ly a , „ u - - - » = - ... _ 

1251 1260 i3fifl 

™ « « « ~ - « « « » ^'S » ^ TO ^"M 

- *. *. H. „ Lya Tlir Gly Ala Tyx Lau £ £ s ~ ~ ~ » - 

1305 1^14 1321 

AA.^ APA ACC CXC'S 

Val He Ala Ser Clu Ar-r ?r rt w 

u *~ 2 Met Met Arg Lvs pk#> *n« v-i * 

i-ys fne He Val Asp Thr Val Thr 

« AA= CAG^ CAC „*£ „ TTC GAT^CAS ATC „™ 

*~ ^ V.l 01 „ ^ Hi „ ^ - - - - --- _ ... _. 

~ ~ ^ ~™ « « « « „ „»« « 

u. Ly „ ^ „« ^ Mu v>1 » ~ ~ : -- --- ... ... _ „. 

if! ~ = « ~ « - « - » - GCA^CCG ATC 

»r ^, ^ aiy 0IU tto ^ m ; - ;; --- ~- ... _. ... 

S S?S ?! ~5 « « « - « «•£ «. 

«y w.^ « v.. „. oly „, ~ ~ - - - = - --- --. 

~ ~~ ^ ~~ ~ ~ « « «. gga'ttc g„ a„'S 

- «• a. A,, », s „ M „,. A .„ ~ ~ ~" ~ ~ - -■- --- --- 

Figure 14c (Continued)
Thermotoga maritima Pullulanase (6GP3) (continued)
= ~~ 2? ~« - ^ - - CCA^ ATA AAC^ 

ciy ryr oi y Lys Gau Thr Lys Ile Lys Arg - ~- -~ - ; --- --- 

1683 1692 1701 

"p oiy ^ n. Ly , s . r ^ - ~ ~ ~ »- - --- 

173? 1746 1755 

CCA GCO TGT CAC GAC AAC CAC ACA CTG TGG «xr Hi ^ 1773 "82 
^ ^UV CTG TCG GAC AAO AAC TAC CTT GCC GCC AAA 

Ala Ala Cys Hio Asp Aan His Thr- t^.. * T"" 7"~ " " — 

P Hi3 Thr Uu Tr * A °P Lys A*n Tyr Lou Ala Ala Lye 

1791 1800 1809 IRIS 

««:^^«*~**«A«X GAA CTG AAA AAC^C CAC AAA^CTG 
Ala Asp Ly 8 Lys , ys Glu ^ ~ ~ ~ Glu £ £ ^ aTI ata £ ^ 

« ^ A. AC, „»» «'S « « GGG^CAG 

oiy Ala zia Lou u» Thr Ser Cla « y Vtti ?ro £ ~ ~ ~ ~ ~ 

1899 1908 1917 

«c rrc tcc acc acc acc aat ™ aac cac aac'S tac aac«c cct a TC ^ 

A-P Pho cy 0 Ax 3 Thr fc Asrj £ Zl ^ £ £ ^ l" ~ ~ ~ ~ 

1953 19S2 i97i 10fln 

ATA AAC OCC TTC GAT TAC OAA ACA AAA CTT CAG TTC ATA CAC^ TTC AAT^ 

He as, cly Pha Asp ^ Bltt - - - - --- --- --- --- 

2007 2016 2025 5 oia 

~ ^ !?T™ ™ ^ ™ -C 2 CCT CCT ^JS W AAA 2 ^ 

His .y, Gly ^ Ile tya Uu ^ Ly ; - ~- ; -- -~ --- --- --- 

2061 207 0 2079 ^ftfto 

OCT CAA CAC ATC AAA AAA CAC c*. CAA „ m ™ « ^ ^ 

Ala «« Clu lie Ly8 Hi8 Lau clu - ~ ~ ~- ~ Z ~ --- --- 

2115 2124 2133 31n 

GCC TTC ATC CTT AAA GAC « « „ „ ^ « AAA GAC ATC GTG^GTG 

Ala Pha Met Leu Lys Asp His Ala Gly Glv Ann Prn ^11 7"" T 

y Uiy Asp Pro Tr P Lys Aap Ho Val Val 

Figure 14d (Continued)
2169 2178 2187

«• „ y „„ Ull 01u ^n,™" ~ ™ ~. ~ »- -•- ... 

2223 2232 77Ai 

*" ""J" v " *" s " ° ta l " " : " ; "* ~ ol: ;:; il ° ° i: ™ ; ;:; «»" 

~ ~ ~ ~ 2 - » « « - «. ™£ 

oiy.Tte ii. «. u» „ .„ ,.,„ s ,; - - ~ - ™ ~- --- 



Figure 14e (Continued)
Figure 15. Thermotoga maritima MSB8 (Clone # 6GP2) Glycoside

1 

CTT TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA AGT GAC GAG TTC 

Leu Leu Ile Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu

GTG AAA GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC 

Val Lys Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe

ATT GGA AGC AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATQ ATA GAC 
Ile Gly Ser Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp

se" vai r r agt gcc aga ^ atg ^ ata ^ gtc CTC AGA ATC ™ 

Val Leu Glu Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp

GGT TTC CTC GAC GOG GAG AGT TAC TGC AGA GAC AAG AAC ACC 7,C A^G CA^ 
Oly Phe Leu Asp 01y Glu ser ^ cys Arg As? Lys ^ £° £ 

p" Z ™ °" ^ - C «° *K 

3GT TTC GAA AGA CTC GAC TAC ACA GTT GCG AAA GCG AAA GAA CTC GGT ATA 
Gly Phe Glu Arg Leu Asp Tyr Thr Val Ala Lys Ala Lys Glu Leu Gly Ile

AAA CTT GTC ATT GTT CTT GTG AAC AAC TGG GAC GAC TTC GGT GGA ATG AAC
Lys Leu Val Ile Val Leu Val Asn Asn Trp Asp Asp Phe Gly Gly Met Asn

CAG TAC GTG AGG TGG TTT GGA GGA ACC CAT CAC GAC GAT TTC TAC AGA GAT
Gln Tyr Val Arg Trp Phe Gly Gly Thr His His Asp Asp Phe Tyr Arg Asp

GAG AAG ATC AAA GAA GAG TAC AAA AAG TAC GTC TCC TTT CTC GTA AAC CAT
Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr Val Ser Phe Leu Val Asn His

OTC AAT ACC TAC ACG GGA GTT CCT TAC AGG GAA GAG CCC ACC ATC ATG GCC
Val Asn Thr Tyr Thr Gly Val Pro Tyr Arg Glu Glu Pro Thr Ile Met Ala

Trp G Glu T T ^ ^ TCG GGG ™ *** 

^ Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp Lys Ser Gly Asn Thr
TIC AAA CCT TAC OCT GGA GAA GCC GAG TGG GCC TAG AAC GGC TGG TCC GGT
Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp Ser Gly

ITl Z r T ^ ^ GA ° ACC GTG GAC ™ «* ACG TTC 

Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr Phe

«x. «-» Tyr Pro Ser H is Trp Gly val ssr pro Glu « ™ ™ 

^ t t : : ta r gac cac - - - - - - GGA ^ 

-/ Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys

CCC GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GT- AA~ AGA 

-ro v« vai Leu 01u 01tt ^ Gly Ile ?ro Lys 5er Aia p ; o - - - 

?n I 0 z r - tc ?so ™ gat ctg •* ™ « - «* - - 

Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp
Ty Ala H r " C GGA ^ GGT - AGA GAC 

GAG AGA GGG TAC TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC 
«« Cly Tyr Tyr Pro Asp Tyr Asp Gly Phe ^ a . ^ J £ ~ 

z n: 2 r r ctg ata aga gaa ™ « - - ™ aa C aca GG t 

Pro Glu Ala Glu Leu Ile Arg Glu Tyr Ala Lys Leu Phe Asn Thr Gly



GAA GAC ATA AGA GAA GAC ACC TGC TCT TTC ATC C~T re* m n« 
Glu Asn r*, CA AAA GAC GGC ATG 

Asp Ile Arg Glu Asp Thr Cys Ser Phe Ile Leu Pro Lys Asp Gly Met

CAO ATC AAA AAG ACC GTG GAA GTG AGG OCT GGT GTT TTC GAC TAC ACC AAC 



Figure 15b (continued) 
- Glu Ile Lys Lys Thr Val Glu Val Arg Ala Gly Val Phe Asp Tyr Ser Asn

ACG TTT GAA AAG TTG TCT GTC AAA GTC GAA GAT CTG GTT TTT GAA AAT GAG 

Thr Phe Glu Lys Leu Ser Val Lys Val Glu Asp Leu Val Phe Glu Asn Glu 

MX GAG CAT CTC GGA TAC GGA ATT TAC GGC TTT GAT CTC GAC ACA ACC CGG 
Ile Glu His Leu Gly Tyr Gly Ile Tyr Gly Phe Asp Leu Asp Thr Thr Arg

ATC CCG GAT GGA GAA CAT GAA ATG TTC CTT GAA GGC CAC TTT CAG GGA AAA
Ile Pro Asp Gly Glu His Glu Met Phe Leu Glu Gly His Phe Gln Gly Lys

ACG GTG AAA GAC TCT ATC AAA GCG AAA GTG GTG AAC GAA GCA CGG TAC GTG
Thr Val Lys Asp Ser Ile Lys Ala Lys Val Val Asn Glu Ala Arg Tyr Val

CTC GCA GAG GAA GTT GAT TTT TCC TCT CCA GAA GAG GTG AAA AAC TGG TGG
Leu Ala Glu Glu Val Asp Phe Ser Ser Pro Glu Glu Val Lys Asn Trp Trp

AAC AGC GGA ACC TGG CAG GCA GAG TTC GGG TCA CCT GAC ATT GAA TGG AAC
Asn Ser Gly Thr Trp Gln Ala Glu Phe Gly Ser Pro Asp Ile Glu Trp Asn

GGT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTG CCC GGA AAG
Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu Pro Gly Lys

AGC GAC TGG GAA GAA GTG AGA GTA GCA AGG AAG TTC GAA AGA CTC TCA GAA
Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu Ser Glu

TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC AAG
Cys Glu Ile Leu Glu Tyr Asp Ile Tyr Ile Pro Asn Val Glu Gly Leu Lys

GGA AGO TTG AGG CCG TAC GCG GTT CTG AAC CCC GGC TGG GTG AAG ATA GGC
Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

CTC GAC ATG AAC AAC GCG AAC GTG GAA ACT GCG GAG ATC ATC ACT TTC GGC
Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly

GGA AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG
Gly Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala



Figure 15c (continued)
GGG GTG AAA GAA CTT CAC ATA GGA GTT GTC GGT GAT CAT CTG AG" TAC GAT
Gly Val Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp

GGA CCG ATT TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG
Gly Pro Ile Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met

TGA 1991
END



Figure 15d (continued)
Figure No. 16a Thermotoga maritima MSB8 (6GB4)
1 ATG AAA AGA ATC GAC CTG AAT GGT TTC TGG ACC GTT AGG GAT AAC GAA GGG Af*a ttt t» 

i - nr. *, ne ASP Le , Asn ciy Phe Tr? ser val ^ Asp Z 2 T y £ Z 

a " r !?* ^ ^ G?C CA ° GCA GM CT ° «* ^A AAA GGT CT^ „ CCA 

». oiu Giy nr v al Pr = 01y val val olB Ala Asp Uu vai ^ «• «t c« 

\\ i r r tac gtt ggg atg ^ gaa gat ctc ^ **» gaa «» - *« « ^ A ^ c 

41 »x. Pro Tyr Val Cly MeC Asn Ciu Asp Leu Ph e Lys Bln Ila Glu Agp ^ J ™ £ 

m TAC GAG AGG GAG TTC GAG TTC AAA GAA GAT GTS AAA GAS GGG GAA CGT GTC GAT CTC GT~ 
« Tyr Clu Arg Glu P he Glu P he Lys 01tt A3p vai Lys Clu Gly ™ 

24i ttt gag ggc gtc gac acg ctg tcg gat GTT tat ctg AAC GGT GTT tac c^t gga AG C ,rr 
a: Phe Clu Ciy Val ASP Thr Leu Ser As P V .l Tyr Leu A sn ciy v " I 



so 

20 

120 
40 

180 
60 

240 

so 



300 

'yr Leu Gly Ser Thr ICO 



301 GAA CAC ATG TTC ATC GAG TAT CGC TTC GAT GTC ACG AAC GT3 TTG AAA GAA AAC AAT CA~ 3 «, 



^ys Asn His 12c 



480 

160 



361 CTG AAG GTG TAC ATA AAA TCT CZZ ATC AGA GTT CCG AAA ACT C— -A- r-- ^ 
121 Leu Lys Val Tyr Ile Lys Ser Pro Ile Arg Val Pro Lys Thr Leu Glu Gln Asn Tyr Gly

m vl' r "* " C WA GGA TAC A?A ACA GCC «° ™ TCG TAC 

141 Val Leu Gly Gly Pro Glu Asp Pro Ile Arg Gly Tyr Ile Arg Lys Ala Gln Tyr Ser Tyr

2 T T TGG GG? GCC ^ *" 108 "» «* CTC TAC CTC UAG 

541 CTC TAC AGG GCA CGT CTT CAG GAT TCA ACG GCT TAT CTG TTG GAA CTT GAG GGG AAA GAT
Tyr Arg Ala Arg Leu Gln Asp Ser Thr Ala Tyr Leu Leu Glu Leu Glu Gly Lys Asp

2 Z T AGG G?G ^ ^ " C ^ ^ GAA ^ " C A " GTG GAA GTT TAT 66 0 

201 Ala Leu Val Arg Val Asn Gly Phe Val His Gly Glu Gly Asn Leu Ile Val Glu Val Tyr 220

Z vll r T ^ * ^ GGG GAG "* ° TT CTT GAA « **= «» «** **» CTC TTC 7 2 0 
Val Asn Gly Glu Lys Ile Gly Glu Phe Pro Val Leu Glu Lys Asn Gly Glu Lys Leu Phe 240

™ Z Z Z Z c CAC CTG " CAT G7G CTA TGG TAT CCG tgg - c gtg g - - «* 

Gly H " L6U ^ S AS P Val L ^ Trp Tyr Pro Trp Asn Vai Gly Lys P ro 



GAG 54 0 
180 

600 
200 



GTG GGG AAA CCG 78 0 
260 
= ^^^ZZZZZZZZZZZZZ :;: 

^^zzzzzzzzzzzzzzz 
™~-zzzzzzzzzzzzz ;:: 

= --^ZZZZZZZZZZZZZZZ Z 

" = = = = r = = - « - : - - - - = - = - ~ 

: ?«= ,0, CTC TOT CAT M „C =C ATC ATG 0?= Too M „ ? ^ ., 

** *- ~ ci - - - - •* ... ... „ „ «. £ z ~ z ":: 

1141 GAA TAT CC3 GAT CAT CT7 CCS TGG TTC AGA AAA - 

33X Glu Tyr Pro Asp His Lau Pr3 T „ p , e ^ ~^ . ^ ^ ^ GA2 3CA AGA ™> ATT 1200 

P-e Ar 3 ^. .eu A.. Asr. oiu Glu Ala Arg ^ 1U ^ 

1201 GTG AGA AAA CTC AGA TAC CAT CCC TC" A- GT- C~r — T ~ 

« « - * «, ,„ s „ :;; « ;: ; £ « - « - « - 

«i/ «sn asp. jiu ^ sn Aan 420 

12S1 TGG GGA TTC GAT GAA TGG GGA AAT A-G GC- *r. a », 

« «, «, ». « P «. „ «, Z « - - « - « « - « 

* ^ys va. Asp „i y L . e Asa Leu Giy Asn 44=) 

« = = - = = = = = = z - z z = - = z = = - ~ 

- = = = = = - - s = z z z z z z z z z z z ■ 

- "zzzzzzzzzzzzzzzzzzz z 

"»» TTO ATC mc OAG TTT CCA TTT CIO OOT OCT CCC CAT CCA OAO .-v 

■•. - ... - ... ... », », 01 „ 01y ». P „ „: z z : c : c ™ z ra isto 

inr lie Glu ?he Phe Ser 520 

156 1 AAA CCC GAG GAA AGA GAG ATA TTC CAT CCC GTr «. 

■» ^ - «. ... a. c 10 ,, Pht ~ - z z Z Z Z Z G " ~ 

-ys his Asn Lys Gin Val Giu 540 
Figure 16b (continued)
z ^^zzzzzzzzzzzzzz z 

1741 TGG CGA AGC AGO AAG TAG AAA ACG GCC GGC GCT CTC ~C TGG CAG -r I.- 

- - «, * ^ w Iht „. „, leu » « z z z z z 

"01 CCO CTC TTC AGC TOO TCC OCA OTC «, TAC TTC AAA AGO CCC AAA OCT CTC -Ac -,c 
«■ » V., «. s„ T„ s. c Ala Vll Asp Tyr Phe : y . ^ Z ZZZ Z Z Z Z 

Z Z ZZZ Z Z Z Z T T " ™ Ata « « » - - - 

»2> CTO CTG CTG GOT GAG CGA TCI GAG GGA GAC AAA AOA AG? CTC -„ CAG „~ , C AG- r-. -.. 

l v " «* - - « - °* - w «. ... s : ot * ™ 

198! CCA OAA GAA GGG AGA AAA GGT ATT CGA AAA GAC TTA CAG AA- GGT AC~ — ,„ ^ 

" s - - «' - - •» - - * - - - z z z z z «:: 

2041 TGT GAG TTT GGT TGA 2055 
681 Cys Glu Phe Gly End 685 



Figure 16c (continued)

WO 98/24799

PCT/US97/22623



Figure No. 17a Bankia gouldi (37gp«)

1 ATG AAA AAA AAT CTA CTA ATG TTT AAA AGO CTT ACG TAT CTX CCT TTO TTT TTA ATG CTG 60
1 Met Lys Lys Asn Leu Leu Met Phe Lys Arg Leu Thr Tyr Leu Pro Leu Phe Leu Met Leu 20



61 CTC TCA CTA ACT TCA GTA GCT CAA TCT CCT GTA GAA AAA CAT GGC CGT TTA CAA GTT GAC 120
21 Leu Ser Leu Ser Ser Val Ala Gln Ser Pro Val Glu Lys His Gly Arg Leu Gln Val Asp 40



121 GGA AAC CGC ATT CTT AAT GCG TCT GGA GAA ATT ACG AGC TTA GCT GGT AAC AGC CTC TTT 180
41 Gly Asn Arg Ile Leu Asn Ala Ser Gly Glu Ile Thr Ser Leu Ala Gly Asn Ser Leu Phe 60



181 TGG AGT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT TTT TTA GCA 240 

61 Trp Ser Asn Ala Gly Asp Thr Ser Asp Phe Tyr Asn Ala Glu Thr Val Asp Phe Leu Ala 80 

241 GAA AAC TGG AAT AGC TCA CTT ATT AGA ATA GCT ATG GGC GTA AAA GAA AAT TGG GAT GGC 300 

81 Glu Asn Trp Asn Ser Ser Leu Ile Arg Ile Ala Met Gly Val Lys Glu Asn Trp Asp Gly 100

301 GGA AAT GGC TAT ATT GAT AGT CCG CAG GAG CAA GAA GCT AAA ATT AGA AAA GTT ATT GAT 360

101 Gly Asn Gly Tyr Ile Asp Ser Pro Gln Glu Gln Glu Ala Lys Ile Arg Lys Val Ile Asp 120

361 GCA GCT ATT GCT AAC GGC ATA TAT GTA ATA ATA GAC TGG CAC ACT CAC GAA GCA GAG TTA 420 

121 Ala Ala Ile Ala Asn Gly Ile Tyr Val Ile Ile Asp Trp His Thr His Glu Ala Glu Leu 140



421 TAC ACA GAT GAG GCT GTT GAC TTT TTT ACC AGA ATG GCA GAC CTA TAC GGA GAT ACT CCC 480
141 Tyr Thr Asp Glu Ala Val Asp Phe Phe Thr Arg Met Ala Asp Leu Tyr Gly Asp Thr Pro 160



481 AAT GTA ATG TAT GAA ATT TAT AAC GAG CCT ATA TAC CAA AGT TGG CCT GTT ATT AAG AAT 540 

161 Asn Val Met Tyr Glu Ile Tyr Asn Glu Pro Ile Tyr Gln Ser Trp Pro Val Ile Lys Asn 180

541 TAT GCA GAG CAA GTA ATT GCT GGT ATA CGT TCT AAA GAC CCA GAT AAT TTA ATA ATT GTA 600 

181 Tyr Ala Glu Gln Val Ile Ala Gly Ile Arg Ser Lys Asp Pro Asp Asn Leu Ile Ile Val 200

601 GGT ACT AGC AAT TAT TCT CAG CAA GTT GAT GTA GCA TCA GCA GAC CCA ATA TCT GAT ACT 660 

201 Gly Thr Ser Asn Tyr Ser Gln Gln Val Asp Val Ala Ser Ala Asp Pro Ile Ser Asp Thr 220

661 AAT GTG GCA TAT ACT TTA CAT TTT TAT GCA GCA TTT AAC CCG CAT GAT AAC TTA AGA AAT 720 

221 Asn Val Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu Arg Asn 240

721 GTA GCA CAG ACA GCA TTA GAT AAT AAT GTT GCT TTG TTT GTT ACA GAA TGG GGT ACA ATT 780

241 Val Ala Gln Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr Ile 260
= ' = = = Z 2 Z 2 Z Z Z Z Z Z Z Z Z Z Z Z Z Z 

= -----ZZZZZ-ZZ2Z Z 

901 GGG TCT GTA GTT CAA GCA GGA CAA GGT GTA TT GGt a 

c-y Val ser 01/ Leu lie Ser A3n Lys Leu ^ ^ ^ 

"1 TCT GGT GAA ATT GTA AAA AAC ATC ATC CAA AAC ~G~ «», ar . 

- - «, «. „, ... ... m 01 „ « * - » « « - ~ « « , ;: 

- £ - s ^ z 2 - z r r ~ - ~~~~ » - 

G.u Cys He Arg Ala Ala Met Gl, Thr Ala Gin Ala 360 
1081 GGA GAT GAA ATT ATA ATT GCC CCT GGA AAC TAC AA- -~ 

"1 =ly Asp oiu lie lie He Ala P . 0 G - v " ^ *** ATA °* 007 ° CC 

P-o G.y Asn Tyr Asn Ph . Gin As ? Lys n . 01n G1/ AIfl J60 

11" TTT AAC CGT ACT GTT TAC CTT TAT GGT AG' GC~ AA- ~» ls , 
3 81 Phe Asn Arg Ser Val Tyr Leu *y- G - v - T ^ ^ CCT ATT A " A 12C = 

ryr Gi/ aer AJa Asn g:/ Asn £er ^ ^ ^ 

1201 TTA AGA GGC GAA AGC GCT ACA AAC CCT C~ G , r„ 

«1 X-u Ar 3 Gly Glu Ser Ala T,r Asn Pro P ' r ' 0 v £ £ ™ "* ^ ™ ™ «* »« 

Se - Leu A SP Tyr Asn Asn Gly 42 0 

1261 TAC CTA TTA ACT ATT GAA GGT GA~ 'AT TCC 11T ,t-. . 

«1 *r ,eu Leu Ser ne Glu Gly T Sp £ I" £ £ ^ T T "° ^ ^ ACT "30 

P y- -rp Asn He Lys Aap Ile Qlu phe Lys ^ ^ ^ 

1321 TCT AAA GGT ATT GTT CTT GAC AAT TCT AA" GGT AG- AA, ™ 

«« - «, w t .„ up „„ s „ z z z z z z z z z z z 

« z z s z z z z z z z z z z r r - - ~ - - - 

*-S Asp Gly ser Ser Asn Asn Ser lie Asp Gly «S0 

- * Z Z Z Z Z Z Z Z r» r """-* ~ 0OT ™ - •» - >- 

Sly A., .»r .y, ,„ 51y pto my Mu ^ ^ ^ ^ 

z z Z Z 2 Z 21 z r T ~ TCT " c •* « « « « « ». 

/ vj-n rtis Asp .hr Tyr Glu Arq A 1 a K 

rg A.a A sn Asn Asn Thr He Glu Asn ' S20 

- = = = 5 = z z z z z z z z z z 2 z z z z z: 

Figure 17b (continued) 
1741 GGT TCT GAA GTA ATA AAT ACT GGA GTA GAC TTT TA GAT AP-A r~, 

56! Gly Ser Glu Val Il e Asn Th- Gly Val a 9 „ pT T " MT ACA 18 

Asn to. Gly Val AS p Phe Leu Asp Arg Gly Thr Gly Phe Asn 



100 

Thr 600 



1801 GGT TTT AGA AAT GCA ATA T— p ™» ««-, 

... ——zzzzzzzzzzzzzzz 

lBfil TCA ACT GCT CGT AAA AAA CAA GGT TCT -CT GAA raa » — — „ 

« - * «. «, «. Bit « ~ 2 Z Z Z Z Z Z Z Z. l Z 

1921 AAC CCT AAT TCT GTT GAT TTT CCA A~A A— GA- r— ,r-« 

*<l Asn Pro Asn Ser v.l As ? P-e Z s t ^ "* ^ *** TTC 1980 

Asp P..e p., ae Ser Asp G , y Thr .. u ^ ^ ^ 

1981 TGC CCA GAT TGG AAT ATA GAA -p- „.. 

- - - - »• - - - " - - - - £ z z z z z z 

z z z z z z z z z z z z z z z - T " z -™ 

•n* -ej val ^.u Gly Tyr Asn Leu Gin Val 700 

2 " 2 Z Z S £ a" 7 T gat gga ac? att "* "* 3?A CTT ~ ata - - - 

A.a Thr Asp Ala Asp G ly Thr lie Asp Asn Val Ly S Leu Tyr Ile Asp Asa ?20 

£ T ^ ^ ^ ^ *» «« GAT TCT CCA AAT 222 0 

721 Asn Leu Val Arg Gin lie As- <5*r- ts^ c„ « , 

Thr Ser Tyr «*" Tr P Sly His Ser Asp Ser Pro Asn 740 
2221 ACA GAT GAA CTT AAT GGT CTT ACA GAA GGA ACT TAT ACC - A .„ r „ 

- - -p «. - - «, ... ~ «. «, „ z z z z z z z z z z: 

'Z z z z z z z Z Z Z 7 ra " s m m » ■» « - « »«° 

P o-y Aia se. Thr Glu Thr Gin Phe Thr Leu Thr Va' ii e Thr pi.. „ 

1 - e Thr Glu Gin Ser Pro 780 

2341 TCT GAG AAT TOT GAC TTT AAT ACA CCT TCT TCA ACT GGT ~» n 
'81 Ser Glu Asn Cys Asd Phe a., t* I °™ ° AT T " GAC ATT 2400 

Cys Asp Phe Asn Thr Pro Ser Ser Thr Gly Leu Glu Asp Phe Asp He Lys 800 

Figure 17c (continued)
- - ~ « ... «. L „ ar s „ cl/ 01y , r „ sit tiu sit wn uo ^ _ ^ 

"« TTT ACT ATT AAT TGG AAT TCG CAA "AC AAT CCC rr, TAT CA. -rr -,' 

» - ~ a, _ „ a,. s „ ^ _ My i=u « ~ « * » « « aac „. 

« * " « ~ ™ z ™ „, 

Tyx Ty, I.. a S „ u. L». p„ „. T „ t ^ ^ ^ ^ ^ 

!S»1 OCA AAT CCA CAA ATA TCT ATT ACC AAT ACC TTA A-T C-T AAT -r- 

- «. - » „. S „ „. s „ _ s „ £ ~ « - « Z « ~ » ,..„ 

2641 GTA ACA.TCA GAT AAC GGT AAT TTT GTG ATG GTA TCT AAA hr~* 

- ~ » „ _ ^ „ „„. ^ » » » Z TTT „ .TA T.C „„ 

"01 TTT ACT AAT CAC OCT ACT OCT CCT ATT TO- AA- 0-r .r- — 

~ ~ ~ - »> «• * «• - u. £ Z Z : ~ £ r ? ~ - ~ 

.... ?.o Ser Asn G-n ne Ser Lys 92a 

2m ATT ACT GAT GAT TCT AGT ATT AAT TTT AAG ~ ,, c rpT 

He T.,r Asp Asp Ser se, : ie Asn ," e £ ^ I - " ^ ™ GAA J82 = 

-/« -« i/r ... as.-. Pro Ala Leu Asp oiu Thr 340 

2821 ATT TTT GTG AGC GCT GAA GAT GAA AAA CTA CCT TTG GTG ~ GT" r, 

941 ne phe vai s - »* ~ *■» «. «*. u u Ala u» v ;: :i: P c r: 2 ;;; 



Figure 17d (continued)
Figure No. 18a Pyrococcus furiosus VC1 (7EG1)

leader sequence; amino acids 1-24 

9 18 27 36 45 

5' ATG AGC AAG AAA AAG TTC GTC ATC GTA TCT ATC TTA ACA ATC CTT TTA GTA CAG
Met Ser Lys Lys Lys Phe Val Ile Val Ser Ile Leu Thr Ile Leu Leu Val Gln



63 72 81 90 99 108

GCA ATA TAT TTT GTA GAA AAG TAT CAT ACC TCT GAG GAC AAG TCA ACT TCA AAT
Ala Ile Tyr Phe Val Glu Lys Tyr His Thr Ser Glu Asp Lys Ser Thr Ser Asn



117 126 135 144 153 162

ACC TCA TCT ACA CCA CCC CAA ACA ACA CTT TCC ACT ACC AAG GTT CTC AAG ATT
Thr Ser Ser Thr Pro Pro Gln Thr Thr Leu Ser Thr Thr Lys Val Leu Lys Ile



171 180 189 198 207 216

AGA TAC CCT GAT GAC GGT GAG TGG CCA GGA GC7 CCT ATT GAT AAG GAT GGT GAT
Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly Ala Pro Ile Asp Lys Asp Gly Asp



225 234 243 252 261 

GGG AAC CCA GAA TTC TAC ATT GAA ATA AAC CTA TGG AAC ATT CTT AAT G~ AC- 
Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr



279 288 297 306 315 324

GGA TTT GCT GAG ATG ACG TAC AAT TTA ACC AGC GGC GTC CTT CAC TAC GTC CAA
Gly Phe Ala Glu Met Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln



333 342 351 360 369 378

CAA CTT GAC AAC ATT GTC TTG AGG GAT AGA AGT AAT TGG GTG CAT GGA TAC CCC
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro



387 396 405 414 423 432

GAA ATA TTC TAT GGA AAC AAG CCA TGG AAT GCA AAC TAC GCA ACT GAT GGC CCA
Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro



441 450 459 468

ATA CCA TTA CCC AGT AAA GTT TCA AAC CTA ACA GAC TTC TAT HI ACA ATC TCC
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
522

TAT AAA CTT GAG CCC AAG AAC GGC CTG CCA ATT AAC ™ S4 ° 

^ - - «. , y , 01y to Ko » z z 2 ::: z 

549 558 S67 576 

TTA ACG AGA GAA GCT TGG AGA ACA ACA GGA *~r , 594 

u. r„ a . „, „ ^ £ « « - « J* „ „ „ CJA 0IA 

Asn Ser Asp Glu Gin Glu Val 
ATG ATA TGG ATT TAC TAT GAC GGA "a CAA CCG rr! 648 

~ - - - - ^ «, =t ~ :r ~ ™ « « « 

657 S6S 675 

ATT GTA GTC CCA ATA ATA GTT AAC GGA ACA CCA o-a ? ° 2 
XL VU Val Pro Ile Ile val Asn G y A £ £ -T ACA TTT GAA GTA 

Y -nr ?.o \a. Asn Ala Thr Phe Glu Val 

' 2 " 12 $ 73= 

TGG AAG GCA AAC ATT ^ _ " /4 ' 75 5 

T^p Lvs A!. T T " ^ AGA A ~ A ™ ACC CCA A~C 

Trp Lvs Ala Asn Ile 3 ,, Tr? Glu ^ ^ ^ ^ ^ ^ ^ ^ 

755 " 4 'S3 7S2 

AAA GAG GGA ACA GTG ACA ATT CCA TAC GGA GCA A ,. _ ^ 61 ° 

Glu Gl y , hr val Thr Ue Pro Tv r Gly Ala ^ n " ^ ^ ^ 

. "iy Ala Phe Ue Ser Val Ala Ala Asn 

819 828 837 Si - 

ATT TCA AGC TTA CCA AAT TAC ACA GAA CTT TAC TTA GAG am ~ ^ 

He Ser Ser Leu Pro As , _ r , r TAC ATT GGA 

-A Thr Glu Leu Tyr leu Glu Asp Val Glu IIe 01y 

873 882 891 go 

ACT GAG TTT GGA ACG CCA AGC ACT ACC TCC CCC CAC ^TA GAS IZ 
- Olu Phe M/ Thr pro ser Thr Thr s „ ^ - - - - « ATC ACA 

927 945 gS4 

AAC ATA ACA CTA ACT CCT CTA GAT AGA CCT CTT ATT TCC ~AA 3 ■ 
Asn Ile Thr Leu Thr Pro Leu Asp Arg Pro Leu Ile Ser



Figure 18b (continued) 